Design of a modular digital computer system by unknown
  
 
 
N O T I C E 
 
THIS DOCUMENT HAS BEEN REPRODUCED FROM 
MICROFICHE. ALTHOUGH IT IS RECOGNIZED THAT 
CERTAIN PORTIONS ARE ILLEGIBLE, IT IS BEING RELEASED 
IN THE INTEREST OF MAKING AVAILABLE AS MUCH 
INFORMATION AS POSSIBLE 
https://ntrs.nasa.gov/search.jsp?R=19800016516 2020-03-21T18:38:56+00:00Z
DESIGN OF A MODULAR DIGITAL COMPUTER SYSTEM
FINAL SUMMARY REPORT
CENTRAL CONTROL ELEMENT
AUTOMATICALLY RECONFIGURABLE MODULAR COMPUTER
One objective of the ARMS (Automatically Reconfigurable Modular System)
spacecraft computer, developed by Hughes Aircraft Company for NASA, is to
provide the capability to choose either to maximize reliability through the use
of redundancy and switchable spare modules or to maximize processing capacity
by reconfiguration to provide multi-computing. Moreover, ARMS must be able
to switch from one mode to another as a function of real-time requirements,
with no hardware changes, at a reasonable cost in power, weight, and volume.
A CCE (Central Control Element) module to control this reconfiguration is the
subject of this new technology disclosure. The CCE is a simplified
implementation of the BOSS (Block Organizer and System Scheduler) module referred
to but not described in this disclosure.
This logic has been implemented and breadboarded under NASA Contract
NAS8-27926 for the George C. Marshall Space Flight Center, Huntsville, Alabama.
It represents a "substantial advance in the state of the art" in that past
computer designs have allowed redundant processing or multi-computing, but
not both in the same computer with real-time mode switching. This new approach
allows the same hardware to be used for reliability enhancement, speed
enhancement, or a combination of both, rather than for just one of these
functions. This could prove very useful and cost-effective in a space mission
or in other applications that have some high-reliability tasks and some
periods of peak computation load during the computer's period of operation.
The ARMS computer controlled by the CCE consists of multiple memories
and CPE's (Central Processing Elements), one or more IOP's (Input/Output
Processors), and a Maintenance/Status Panel. These modules are standard
computer building blocks with the exception of their interface logic, as
described in our previous New Technology Report. Each module contains
internal detection logic utilizing redundancy and error detecting and
correcting codes in keeping with standard techniques used in many modern computers.
(NASA-CR-161457) DESIGN OF A MODULAR DIGITAL COMPUTER SYSTEM Final Summary
Report (Hughes Aircraft Co.) 48 p HC A03/MF A01 CSCL 09B N80-25009 Unclas
G3/60 21023
Thus the CCE and interface logic concepts implemented in ARMS could also be
applied to other general purpose computers needing ARMS attributes. ARMS modules
can be configured for simplex, duplex redundant, or triply modular redundant (TMR)
operation.
M&S Computing, Inc., of Huntsville, Alabama, a subcontractor to Hughes
Aircraft Company on this contract, was responsible for ARMS software
development. No new technology was discovered in the course of this subcontract.
THE ARMMS COMPUTER
Any computer system justifies the cost of its development to the degree that
it provides new capabilities or allows earlier ones to be satisfied at reduced
cost. The Automatically Reconfigurable Modular Multiprocessor System (ARMMS)
is primarily oriented toward providing the following new capabilities for
spaceborne computers for application in the 1980 to 1985 time period:
1. To provide a modular computer system which is responsive to many
mission types and phases.
2. To achieve through modularity a higher computing capability than
previously available for spaceborne application. A target of several
million instructions per second has been chosen.
3. To provide the capability to choose to maximize reliability through
the use of redundancy or to maximize processing capacity through
multiprocessing. This multi-mode capability must be dynamic, that
is, a given system may alternate from one mode to another as a
function of real-time requirements.
4. To maximize reliability in all applications through the incorporation
of fault detection and recovery features and through the use of high
reliability components.
The first consideration of any ARMMS design tradeoff is to avoid
compromising these basic objectives. However, continuous concern must be
maintained for the practical requirements of implementation.
ARMMS is an outgrowth and extension of two NASA development programs,
the MSFC Space Ultrareliable Modular Computer (SUMC) and the ERC Modular
Computer.
ARMMS consists of a grouping of Central Processor Elements (CPE's), I/O
Processors (IOP's), Memory Modules, and a Block Organizer and System Scheduler
(BOSS) module that will execute software routines for data and I/O
scheduling, interrupt processing, system test, repair, and configuration, and power
and clock switching and distribution. The IOP's, CPE's, and BOSS are
connected to the memory modules by 4 pairs of buses as shown in Figure 1.
One of the toughest challenges ARMMS faces is rapid, reliable reconfiguration
at a reasonable cost in power, volume, and complexity. A system of processor
and memory interface logic that accomplishes this is the subject of this new
technology disclosure.
INTERMODULE INTERFACE APPROACH
An intermodule interface has been designed that allows any CPE, IOP, or
BOSS module to address any non-protected memory page. It allows any
combination of simplex, duplex, or TMR streams with any combination of relative
priorities to coexist with minimum bus contention, provided that no more
than 4 CPE's, 4 IOP's, and BOSS are involved simultaneously. Volatile storage
defining a module's role in ARMMS has been minimized and can be coded such
that transients cannot cause an undetected change in the module's status.
The interface allows all modules of a class (CPE, Memory, etc.) to be
virtually identical. Interface gate complexity and module-to-module
interconnections have been minimized.
Whenever a stream is formed, BOSS sends each processor module involved
a stream status code defining all bus connections within the stream and
that stream's priority. Once assigned to a stream, a processor always uses
the pair of buses specified by the stream status code for communication to
and from memory, eliminating bus contention among processors of a given type.
For redundancy, each processor can output on a choice of two buses. This
choice is made by BOSS command. To reduce bus contention between processors
of different types, a hierarchy is established such that I/O and BOSS modules
can inhibit CPE modules from starting a new memory access cycle when the
former modules require access to a memory bus. Similarly, BOSS (but not CPE)
modules can inhibit I/O modules' bus access. Once any module has been granted
access, it will continue to have it until transfer of the word involved has
been completed. Usually, only processors using buses needed by other processors
are inhibited, except that all processors operating synchronously in a duplex
or TMR stream are inhibited if one or more processors in the stream are
inhibited, ensuring maintenance of synchronization between these processors.
Modeling indicates that speed lost due to bus contention between processors
of different types should be less than 3%, exclusive of memory contention losses
that are independent of the interface design.
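The inhibit hierarchy described above can be sketched in Python as follows. This is an illustrative model only, not the breadboard logic: the names RANK, Processor, and inhibited are assumptions introduced here, as is the simplification that a processor is inhibited whenever a higher-ranked, requesting processor is assigned any bus it shares. The rule that a granted access runs to completion of the word transfer is not modeled.

```python
from dataclasses import dataclass

# Assumed ranking for illustration: BOSS may inhibit IOPs and CPEs,
# IOPs may inhibit CPEs, and CPEs inhibit nobody.
RANK = {"BOSS": 2, "IOP": 1, "CPE": 0}

@dataclass(frozen=True)
class Processor:
    name: str
    kind: str              # "BOSS", "IOP" or "CPE"
    buses: frozenset       # buses assigned by the stream status code
    stream: str            # hypothetical stream identifier
    requesting: bool = True

def inhibited(procs):
    """Return the names of processors barred from starting a new access cycle."""
    blocked = set()
    for p in procs:
        for q in procs:
            higher = RANK[q.kind] > RANK[p.kind]
            if q.requesting and higher and p.buses & q.buses:
                blocked.add(p.name)
    # Every processor in a stream with a blocked member stalls as well,
    # so that synchronous duplex or TMR processors stay in step.
    stalled = {p.stream for p in procs if p.name in blocked}
    return {p.name for p in procs if p.stream in stalled}
```

In this sketch, an IOP requesting the bus used by one CPE of a TMR stream stalls all three CPEs of that stream, while a simplex CPE on an unrelated bus continues to run.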
BOSS assigns each memory module a page address and a high, middle, or low
bus response assignment in the case of memory accessible by a TMR stream (or a
high or low assignment for access by a duplex stream). Memory page size will
equal memory module size. All memory modules assigned to a given page output
on the same bus when accessed by a simplex stream or on different buses
according to their bus response assignment when accessed by duplex or TMR streams.
Examples are shown in Figures 2 and 3. All duplex or TMR stream processors
receive memory outputs on all buses assigned to that stream. Each processor
access request contains a page address and a bus priority code. Processors
will continue to request access until it is granted or until they are temporarily
inhibited by other processors' access requests.
The assignment codes discussed above require 6 bits from BOSS to memories,
and 9 bits from BOSS to processors, plus extra bits for error detection coding.
Each module input interface includes voting and fault detection coding logic.
These interfaces can be implemented at an estimated complexity of 1000 gates
per module.
The ARMMS priority structure will involve both hardware and software
elements. The hardware recognizes a minimum of 16 different priority levels.
The software then selects different subsets of these 16 as program requirements
dictate. The highest hardware priority goes to BOSS, since the efficiency of
the rest of the system depends on BOSS completing its tasks efficiently. The
second highest priority is a special TMR CPE mode used only in the event of an
error in one of three TMR channels to ensure completion of the TMR task with
maximum speed prior to initiating diagnostic tests on the stream. The next
seven priorities are for I/O streams, on the assumption that the timing of
external events and mass data transfers is more difficult to control
than the timing within processing streams and, hence, IOP memory access
requests should be given higher priorities than CPE access requests. The seven
lowest priorities are for CPE's. Different numbers or arrangements of
priorities could be easily implemented if required.
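The allocation of the 16 hardware priority levels can be summarized in a short Python sketch. The numeric encoding (0 as the highest priority) and the function name priority_class are assumptions made for illustration; the report gives only the ordering of the classes, not their bit codes.

```python
def priority_class(level: int) -> str:
    """Classify an assumed 4-bit priority level, 0 being the highest."""
    if not 0 <= level <= 15:
        raise ValueError("priority level must fit in 4 bits")
    if level == 0:
        return "BOSS"
    if level == 1:
        return "TMR CPE error recovery"
    if level <= 8:           # levels 2..8: seven I/O stream priorities
        return "I/O stream"
    return "CPE stream"      # levels 9..15: seven CPE stream priorities
```

One BOSS level, one TMR recovery level, seven I/O levels, and seven CPE levels account for all 16 codes.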
So long as BOSS, I/O, and CPE programs are mostly segregated into
different memory pages, all 3 types of programs should be able to be executed
simultaneously with minimal bus or memory contention. When these programs
wish to access the same memory page, the internal logic design of the memory
access logic will tend toward letting the streams access the memory a word
at a time in turn, since each processor will release the memory temporarily
between access requests, letting the next highest priority stream gain access
for one word. This results in all contending streams slowing down, but none
stopping entirely. Obviously, this does not preclude the need for designing
the software to minimize memory contention if ARMMS is to perform efficiently
as a multiprocessor.
The seven priority levels available for normal I/O and CPE scheduling are
ordered in descending priority as shown in Table I, allowing the 14 modes listed
in the table. The logic allows any of the combinations listed for CPE's to be
used simultaneously with any of the combinations listed for IOP's. Note that
the choices allow for any combination of relative priorities between streams
of differing criticality, and that the software system can change the priority
assignment of a given stream at will; also, that combinations such as 2 duplex
I/O streams and a simplex plus a TMR processing stream are allowed. If IOP's
and CPE's are to be tied together via software into a "full processing stream",
both processor types could be given either the same CPE or the
same IOP priority assignment by BOSS. Otherwise, BOSS assigns IOP's only I/O
priority codes and CPE's only CPE priority codes, and the hardware provides
for complete independence of the I/O and processing streams subject only to
software restrictions.
In order to access data from memory, a processor must provide a 4-bit page
address to select one of 16 memory pages; a 4-bit priority request to allow
the given memory page to choose the highest priority stream's request and
determine if the correct number of processors agreed on this request, the number
being determined by the priority's mode (simplex, duplex, or TMR); a 3-bit
2-out-of-3 coded Read/Write/Transfer request; and a 13-bit word address to select
one of up to 8,192 words in a memory module. The first 8 of these 24 bits must be
present for a memory to make a decision as to whether or not to grant the request.
In addition, a sync or "access request" signal must be present to tell the memory
that it is supposed to be making such a determination if these 8 bits are to be
transmitted on lines that can also carry word addresses and data that might
otherwise be confused with page and priority information. The processor to
memory bus must be at least 8 bits wide plus the access request line and any
desired parity lines in order to function efficiently.
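The 24-bit access request described above (4 + 4 + 3 + 13 bits) can be illustrated with a small Python sketch. The field order within the word and the assignment of the three 2-out-of-3 code words to Read, Write, and Transfer are assumptions; the report states only the field widths and that the page and priority fields come first.

```python
# Assumed mapping: each operation uses a 3-bit word with exactly two ones.
OP_CODES = {"read": 0b011, "write": 0b101, "transfer": 0b110}

def pack_request(page, priority, op, word_addr):
    """Pack page, priority, operation and word address into 24 bits."""
    if not (0 <= page < 16 and 0 <= priority < 16 and 0 <= word_addr < 8192):
        raise ValueError("field out of range")
    return (page << 20) | (priority << 16) | (OP_CODES[op] << 13) | word_addr

def unpack_request(word):
    """Recover the fields, rejecting operation codes that are not 2-out-of-3."""
    op_bits = (word >> 13) & 0b111
    if bin(op_bits).count("1") != 2:
        raise ValueError("operation code is not a valid 2-out-of-3 word")
    op = next(name for name, code in OP_CODES.items() if code == op_bits)
    return (word >> 20) & 0xF, (word >> 16) & 0xF, op, word & 0x1FFF

def header(word):
    """First 8 bits (page and priority) needed for the grant decision."""
    return word >> 16
```

Because every valid operation code contains exactly two ones, any single-bit error in that field produces an invalid code and is detected by unpack_request.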
If the buses are less than a full word wide, the memory to processor bus must
contain, in addition to data lines, a dedicated memory response line to
signal the processor that the first bits of address have been accepted and
the processor is to continue the transmission to completion. If a processor
does not receive this response signal, it will continue to transfer the first
bits of the address to the memory interface until either the processor is
inhibited by another processor or the memory responds to the data. Since only
one processor can use the bus at a given time, all requests and responses are
unambiguous.
Three additional lines are required in connection with the memory buses
at the processors only. Each processor receives inhibit lines from each of
the other two classes of processors and sends an inhibit to these other two
classes, describing each processor's bus activity. In addition, an I/O busy
line may be required from IOP to CPE in the event of several CPE's wishing to
access a given IOP simultaneously. This will depend on the details of the
IOP's and is shown for completeness. Note that the BOSS module receives the
IOP's Memory Access Request as an inhibit rather than the IOP's normal inhibit
line which does go to the CPE. This is because the IOP's memory access request
line will not go true until all buses needed by the IOP have been cleared of
traffic and, hence, this line will inhibit BOSS only in the event that the IOP
can gain access to the memory through use of free buses or inhibiting CPE's,
maintaining BOSS priority over the IOP. The information to be transferred to
or from a memory by processors is summarized in Figure 4, assuming a 32-bit
data word plus 7 error correction code bits and a 13-bit bus width.
INTERFACE LOGIC DETAILED DESIGN
Within each processor (BOSS, CPE, or IOP) is an access request network
that will request memory access whenever an appropriate bit appears in the
processor's microprogram, subject to the inhibitions (BOSSINH, etc.) from other
processors. Figure 5 shows a gate level drawing for this logic in the case of
the CPE module. Logic for IOP and BOSS is similar and is shown in Figures 6
and 7. The choice of inhibiting factors is controlled by the Stream Assignment
Register in the CPE or by hardware connections in BOSS and IOP, with BOSS
having highest priority to memory, IOP middle priority, and CPE lowest priority.
The logic also correlates memory responses (MEMRES) to its access request
(MEMREQ) and, when a response from the correct memory modules occurs, sets a
flip-flop (AGF) allowing the access to go to completion and inhibiting other
access to the bus until the cycle is complete, as signaled by a second
microprogram bit within the processor. IOP and BOSS access control logic differs
from that of the CPE only in that an Access Request Flip-Flop (ARF) is
incorporated, which locks out lower priority modules from accessing memory while
these higher priority modules are requesting memory access. All modules can lock
out others while they are actually accessing memory instead of merely
requesting it.
Figure 8 gives a detailed view of the logic within each memory module's
access control block. Figures 9 and 10 show the same logic at a gate level.
As the data comes in on each bus, buses whose access request lines are true
and whose page addresses agree with a memory module's page address (PGID)
are tested for access to the memory registers. The 16 priorities (Ai..Pi)
are decoded and applied to the request detection and priority ordering logic.
If this circuit detects the correct number of requests of the highest priority
present at the time of the test (BOSS ... CPE SMPLXD) and the memory is not
already in use (DS1 ... DS4 = 0), the memory responds (RS 1WMEM RES) on the buses
assigned to the processor generating the request and gates the response
decision into the Response and Criticality fields of the Assignment Holding
Register and to the voting logic to allow the voted data to go to the memory
registers and to set up the proper output bus paths for the memories' data
input in the case of a Read. When the cycle is complete, the Response and
Criticality fields of the Assignment Holding Register are cleared, and the
memory is ready for the next access. The bus output mode field determines
on which of 3 TMR buses a memory module will output in TMR according to an
assignment from BOSS.
Each module contains voting logic which will vote any combination of 3,
compare any combination of 2, or transfer any one bus's inputs to an appropriate
module register, signaling any disagreements to the module's status/command
network, which will interrupt BOSS as appropriate. In processor modules, the
voter paths are controlled by the Stream Assignment Register, while in memory
modules they are under the control of the Response and Criticality fields of
the Memory Assignment Holding Register. This logic allows for maximum
software flexibility in the ARMMS configuration process with a moderate amount of
hardware.
Figure 11 shows 3 simple circuits for interfacing with bused data. The
first allows masking of "stuck on 0" failures in the duplex and TMR modes on
the assumption that the transmitting module was designed to transmit "0" when
it had detected an internal failure. This circuit could then be followed by
error detection or correction logic. This circuit also allows straight-through
transmission of simplex data. The second circuit is a basic voter for use in
TMR only. It does not allow error detection, only correction. The first and
second circuits could be used together for a full simplex, duplex, and TMR
capability. The last circuit provides a fault detection add-on, for TMR only, that
signals a fault when no combination of 3 bus inputs agree. The principal
disadvantage of this circuit is that while it detects faults, it does not say which
bus was at fault.
Figure 12 shows the voter/switch used in baseline ARMMS. It incorporates
all the features of the three circuits discussed above, and in addition allows
fault isolation to a specific bus. This circuit normally allows ORing together any
enabled bus signals as in the first circuit above. Simultaneously, it votes
on the enabled (DSi) data inputs in TMR and generates a fault signal (FLTi)
for any enabled bus input that disagrees with other enabled inputs. This
fault signal is output to the module's fault control logic and is used to
prevent that bus's data from passing through the data-ORing section of the
voter switch.
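The behavior of such a voter/switch can be approximated in software as shown below. This is a behavioral sketch under stated assumptions, not a transcription of Figure 12: the function name voter_switch is hypothetical, words are modeled as Python integers, and in duplex a disagreement is reported on both buses without masking either one, since a two-way compare cannot identify the faulty bus.

```python
def voter_switch(inputs, enabled):
    """Behavioral model of a voter/switch over up to three buses.

    inputs:  dict mapping bus number -> received word (int)
    enabled: iterable of 1, 2 or 3 bus numbers enabled for this stream
    Returns (output_word, faulted_buses).
    """
    words = {bus: inputs[bus] for bus in enabled}
    if not 1 <= len(words) <= 3:
        raise ValueError("expected 1, 2 or 3 enabled buses")

    faulted, masked = set(), set()
    if len(words) == 3:
        a, b, c = words.values()
        majority = (a & b) | (a & c) | (b & c)   # bitwise 2-out-of-3 vote
        faulted = {bus for bus, w in words.items() if w != majority}
        masked = faulted                         # TMR isolates the bad bus
    elif len(words) == 2:
        a, b = words.values()
        if a != b:
            faulted = set(words)                 # source unknown, nothing masked

    # OR together the enabled buses that have not been masked.
    output = 0
    for bus, w in words.items():
        if bus not in masked:
            output |= w
    return output, faulted
```

In duplex, a module that transmits all zeros after detecting its own failure is masked by the OR with the good bus, which corresponds to the first circuit of Figure 11.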
The intermodule interface circuits described have a gate delay of 17,
including 5 in the voter switch, 2 in the processor access control logic, and
10 in the memory access control logic. This amounts to a 51 nsec propagation
delay, assuming a 3 nsec average delay per gate for LSI silicon-on-sapphire
CMOS logic. For a 10 MHz data bus transfer clock rate, this would leave 49
nsec for bus driver, receiver and transmission delays.
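The timing budget above follows from simple arithmetic, reproduced here as a short Python sketch. The figures are those quoted in the text; the names GATE_DELAYS and timing_budget are introduced only for illustration.

```python
# Gate-delay counts quoted in the text for each part of the interface path.
GATE_DELAYS = {
    "voter switch": 5,
    "processor access control": 2,
    "memory access control": 10,
}

def timing_budget(gate_delays, ns_per_gate=3.0, bus_clock_hz=10e6):
    """Return (total gates, logic delay in ns, time left per bus clock in ns)."""
    total_gates = sum(gate_delays.values())
    logic_ns = total_gates * ns_per_gate
    period_ns = 1e9 / bus_clock_hz
    return total_gates, logic_ns, period_ns - logic_ns

print(timing_budget(GATE_DELAYS))  # (17, 51.0, 49.0)
```

With a 100 nsec clock period at 10 MHz, the 51 nsec of logic delay leaves 49 nsec for drivers, receivers, and transmission.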
Central Control Element (CCE). The Central Control Element distributes power
and clock signals to other ARMS modules and coordinates ARMS reconfiguration
either due to new assignments from the maintenance/status panel or in response
to fault interrupts from other ARMS modules. In order to minimize costs, the
breadboard CCE does not include redundancy that could be implemented in a
flight version. For maximum reliability, a TMR CCE with voting between the
parts on all outputs would be desirable.
The CCE consists of individual status controllers for each ARMS module to
be controlled, fault correlation logic, an overall program initiator and
reconfiguration controller, switching logic for power supplied to other modules,
a crystal controlled central clock source, and external interrupt routing logic.
The CCE has no internal processing or main memory bus access capabilities but
is capable of utilizing CPE software or hardware to enhance its own hardwired
capabilities by means of interrupts. A block diagram of the CCE is shown in
Figure 13. The following is a description of the specific embodiment of the
CCE used in the ARMS breadboard:
CPE Module Status Controller. One CPE module status controller is required for
each of the 4 CPE modules in the ARMS breadboard. Each controller keeps track
of the CPE's status (spare, active normal, active abnormal, failed), outputting
a stream assignment bit corresponding to that CPE's hardwired processor (to
memory) bus. Together the 4 CPE module status controllers provide a 12-bit
stream assignment to all CPE's identifying which CPE's are active and which
are passive. When the CCE is powered initially, each CPE module status controller
places its CPE in the spare state. A signal from the maintenance/status panel
causes one or more of these controllers to place their CPE's in the "active
normal" state. If a fault interrupt from either a CPE or a memory module
indicates that a specific CPE may have failed, that CPE's status controller is
placed in the "active abnormal" state. Figure 14 shows the various states that
a module status controller may take on.
If the CPE is operating in the simplex mode when the fault is detected,
or if it is operating in the duplex mode and the fault is detected by a memory
module without being internally detected within the CPE, the CPE module status
controller causes the Program Initiator and Reconfiguration Controller (PIRC)
logic discussed in the next section to issue a stop CPE interrupt immediately.
If the CPE is operating in the TMR mode when the fault is detected, or if it
is operating in the duplex mode and the fault is internally detected within the
CPE, the controller issues a stop CPE interrupt immediately following receipt
of a CPE available/rollback pace signal from the CPE, or after a prescribed time
interval, whichever occurs first. Once in the "active abnormal" state, one of the
following events occurs in the CPE module status controller:
(a) If the CPE issues a CPE available/rollback pace signal prior to
receipt of another fault interrupt concerning this CPE, the status
controller returns the CPE to the "active normal" state.
(b) If another fault interrupt concerning the CPE is received prior to
receipt of the CPE's available/rollback pace signal, the controller
enters the failure pending state. From this state the reconfiguration
controller either replaces the faulty module if it has sufficient
priority and a spare is available, transferring its assignment to
the spare CPE and causing the CPE module status controller to place
its CPE in the failed state, or otherwise the reconfiguration
controller returns the CPE module status controller to the active
abnormal state. Thus modules that cannot be immediately replaced
continue to be retried, and ARMS continues to operate in the presence
of maskable failures.
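The state transitions described for the CPE module status controller can be sketched as a small table-driven state machine in Python. The event names ("activate", "fault", "rollback_pace", "replace", "retry") are hypothetical labels chosen for this illustration, and the table covers only the transitions stated in the text; Figure 14 may show others.

```python
from enum import Enum

class State(Enum):
    SPARE = "spare"
    ACTIVE_NORMAL = "active normal"
    ACTIVE_ABNORMAL = "active abnormal"
    FAILURE_PENDING = "failure pending"
    FAILED = "failed"

# (current state, event) -> next state, as described in the text.
TRANSITIONS = {
    (State.SPARE, "activate"): State.ACTIVE_NORMAL,             # panel signal
    (State.ACTIVE_NORMAL, "fault"): State.ACTIVE_ABNORMAL,      # first fault
    (State.ACTIVE_ABNORMAL, "rollback_pace"): State.ACTIVE_NORMAL,
    (State.ACTIVE_ABNORMAL, "fault"): State.FAILURE_PENDING,    # second fault
    (State.FAILURE_PENDING, "replace"): State.FAILED,           # spare took over
    (State.FAILURE_PENDING, "retry"): State.ACTIVE_ABNORMAL,    # no spare
}

def step(state, event):
    """Return the next state; events with no listed transition are ignored."""
    return TRANSITIONS.get((state, event), state)
```

A module that faults once and then reaches its next rollback point returns to "active normal"; only a second fault before that point makes it a candidate for replacement.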
Fault interrupts from IOP or main memory modules cause issuance of a stop
CPE interrupt immediately if the CPE is operating in the simplex mode, or
immediately following receipt of a CPE available/rollback pace signal from the
CPE if the CPE is operating in the duplex or TMR mode. The CPE module status
controller remains in the active normal state during this operation in the absence
of a fault interrupt placing blame on the CPE. The CPE module status controller
also issues stop CPE interrupts prior to any external command update of
assignments from the maintenance/status panel or due to an emergency such as an
impending power failure.
IOP Module Status Controller. The IOP module status controller design
requirements are similar to those for the CPE module status controller with the
following exceptions:
(a) A stop IOP interrupt will not be issued unless the IOP does not
stop within a prescribed time interval after all CPEs have halted.
(b) An IOP will not be returned to the "active normal" state from the
"active abnormal" state unless all active CPEs issue CPE available/
rollback pace signals prior to the receipt of another fault interrupt
concerning this IOP.
Main Memory Module Status Controller. One main memory module status controller
will be required for each of the 4 main memory modules in the ARMS breadboard.
These controllers' design requirements will be similar to those for the CPE
module status controller with the following exceptions:
(a) A stop memory interrupt is not required.
(b) A main memory module will not require stream assignment status bits
but will require page address and output bus assignments. The output
bus assignment determines if a memory module will transfer data to
CPEs or IOPs on the lower, (middle), or upper numbered memory (to
processor) bus paired with the processor (to memory) buses to which
access was granted. An "essential/non-essential" memory status bit
is also required internal to the main memory module status controller
to determine the proper memory replacement algorithm for the
reconfiguration controller in response to a memory fault interrupt. An
essential memory contains programs and important data the loss of
which could disable a stream. A non-essential memory contains
working storage and other contents the loss of which would not disable
a stream.
(c) A main memory will not be returned to the "active normal" state
from the "active abnormal" state unless all active CPEs issue CPE
available/rollback pace signals prior to the receipt of another
fault interrupt from this memory.
Program Initiator and Reconfiguration Controller. The program initiator and
reconfiguration controller (PIRC) starts the ARMS CPEs initially, restarts them
if they have been stopped for any reason, and controls the transfer of status
assignments between individual module status controllers when ARMS reconfiguration
is required.
The program initiator logic is activated whenever a load request is
received from the maintenance/status panel, any faults are detected, or CPE
available/rollback pace signals are not received from all CPEs within an
interval timed by the PIRC logic. The various states that the PIRC logic can
assume are shown in Figure 15. Once activated, the program initiator logic
issues stop interrupts to the CPEs as discussed in the previous section, issues
a panic halt signal to all CPEs and the IOP, waits for CPE available/rollback pace
signals from all CPEs and an IOP available signal to stabilize in the available
states, and then takes one of the following actions in descending priority:
(a) In the case of an essential memory failure in the duplex or TMR mode,
the program initiator logic issues a clear memory interrupt to the
questionable memory, forcing its output to "0" pending completion
of initialization, followed by an initialize memory interrupt, along
with control information specifying the memory page to be initialized,
to the highest priority CPEs. These CPEs enter a program that
alternately reads from and then writes into every word in that memory page,
duplicating data from the good memory(s) into the newly assigned
memory. All zero output conditions from the memory being initialized
shall be considered to be normal until this operation is completed, as
signaled by a rollback pace signal from the CPE in question. Upon
receipt of this signal the program initiator logic issues start
interrupts to any remaining active CPEs if more than one processing
stream is used in ARMS and restores the newly initialized memory to
normal operation. Upon completion the memory initialization program
automatically returns to the appropriate rollback point of the
program in progress at the time of the interrupt.
(b) In the case of any other failure, the program initiator logic issues
start CPE interrupts to all active CPEs, causing them to return to
the appropriate rollback point(s) for the program(s) in progress at
the time of the interrupt.
Figure 16 shows the PIRC logic necessary to respond to CPE rollback pace
signals and to issue the interrupts discussed above. The reconfiguration
controller controls the transfer of status assignments between individual
module status controllers in response to commands from the breadboard's
maintenance/status panel or to any of the individual module status controllers
entering the failure pending state. Transfers of status assignments from
failed active modules to newly activated spare modules occur once the program
initiator logic verifies that the IOP and all CPEs are available (i.e., stopped)
and prior to issuance of any interrupts by the program initiator logic, with
the following restrictions:
(a) Only one module of any given type can be replaced at a time and a
spare module of that type must be available. For example, one
memory plus one CPE may be replaced, but not two CPEs at one time.
If two CPEs did fail at once, one would be retried a second time
and if it still malfunctioned and an additional spare CPE was
available it would then be replaced.
(b) Essential main memory modules operating in simplex cannot be
replaced by spares since no mechanism for initializing them is
available. A permanent failure in such a memory module requires
outside intervention for correction.
The logic for transferring assignments between status controllers is shown
in Figure 17.
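The two replacement restrictions can be expressed as a short Python sketch. It is only a model of the stated rules, not of the Figure 17 logic: the function plan_replacements, its arguments, and the module identifiers in the usage note are all hypothetical, and the pending list is assumed to be supplied in priority order.

```python
def plan_replacements(pending, spares, simplex_essential=()):
    """Decide which failure-pending modules are replaced and which are retried.

    pending:           list of (module_id, module_type) in priority order
    spares:            dict mapping module_type -> list of spare module ids
    simplex_essential: ids of essential memories running in simplex
    Returns (replacements, retried), where replacements maps a failed
    module id to the spare that takes over its assignment.
    """
    replaced_types = set()
    replacements, retried = {}, []
    for module_id, module_type in pending:
        can_replace = (
            module_type not in replaced_types         # one per type at a time
            and bool(spares.get(module_type))         # a spare must exist
            and module_id not in simplex_essential    # cannot be initialized
        )
        if can_replace:
            replacements[module_id] = spares[module_type][0]
            replaced_types.add(module_type)
        else:
            retried.append(module_id)
    return replacements, retried
```

For example, with two CPEs and one memory pending and spares of both types available, one CPE and the memory are replaced and the second CPE is retried, matching the example given in restriction (a).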
Fault Correlation Logic. The fault correlation logic allows the CCE to
maximize the probability of correctly isolating a fault to a specific ARMS module
within limitations dictated by a reasonable level of hardwired logic complexity
and allows the CCE to determine that certain faults are maskable so that
critical programs can continue to completion. The CCE correlates received
fault interrupts from each CPE, IOP, and main memory module with appropriate
status information from their status controllers as shown in Figure 18.
Many CPE and IOP faults may be isolated due to fault interrupts from
the module in question. Single memory module fault interrupts indicate
failures within the interrupting memory. In duplex and TMR modes simultaneous
fault interrupts from two or more memories can isolate a failure to a CPE or
IOP module whose identity is encoded in the interrupt. In the duplex mode
these interrupts may only isolate the fault to one of two CPEs or IOPs in
the absence of a direct fault interrupt from the offending module. However,
an arbitrary replacement of one of these modules provides 50% probability
of success in cases that otherwise would result in an ARMS system failure.
In simplex mode detectable faults (other than maskable single bit failures
within main memory modules) result in immediate rollback or replacement of the
offending module. In the absence of a fault interrupt from the CPE or IOP, the
fault is blamed on non-essential memories or, in the case of an ambiguous
fault, on the CPE or IOP accessing an essential memory. If a fault is unambiguously
isolatable to an essential memory, the fault is uncorrectable since no mechanism
exists for initializing a spare in this mode. Some faults may be undetectable
in simplex mode.
In duplex mode virtually all faults are detectable and at least those
detectable in simplex allow the program to continue to its next rollback point
and then are correctable in real-time through reconfiguration so long as spare
modules are available. In all modes the ARMS breadboard is capable of continued
computation in the presence of faults so long as these faults are maskable.
The choice between rollback and continued computation is software determined
in that it is dependent upon whether the program is stopped before or after
the program status block is updated. If the block has been updated, the next
program is executed; if not, the present program is repeated. Programs
shall be constructed so that they can be repeated if necessary.
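The rollback rule above can be sketched in a few lines of Python. This is an
illustrative model only; the function name, the trace format, and the way a
fault is described (program index plus a flag saying whether the stop occurred
before the status block update) are assumptions, not part of the ARMS design.

```python
def run_with_rollback(programs, fault=None):
    """Return the order in which programs execute (hypothetical model).

    programs -- list of program names, each assumed to be repeatable
    fault    -- None, or (index, stopped_before_update) for one fault
    """
    trace = []
    i = 0
    while i < len(programs):
        trace.append(programs[i])
        if fault is not None and fault[0] == i:
            stopped_before_update = fault[1]
            fault = None                  # the fault is handled only once
            if stopped_before_update:
                continue                  # block not updated: repeat program
        i += 1                            # block updated: go to next program
    return trace


repeated = run_with_rollback(["P1", "P2", "P3"], fault=(1, True))
continued = run_with_rollback(["P1", "P2", "P3"], fault=(1, False))
```

Here `repeated` shows P2 executed twice, while `continued` shows the sequence
proceeding unchanged because the status block had already been updated.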
Power Switching Logic. The CCE distributes power to all other ARMS modules.
The power switching logic provides power to each ARMS module whose individual
status controller places it in either an "active normal", "active abnormal",
or "failure pending" state.
Crystal Controlled Clock. The CCE contains a crystal controlled oscillator
providing central clock signals to all ARMS modules to assure their
synchronization.
External Interrupt Logic. The CCE holds external interrupts when they are
received and routes them to the CPEs for which they were intended. When a CPE
responds to a given interrupt it sends a response to the CCE which clears the
interrupt once it receives response from a majority of the CPEs to which the
interrupt was sent. As in the case of the power and clock distribution,
external interrupts are routed through the CCE since it is the only element in
ARMS which remains stable throughout system reconfiguration. Clock distribution
and external interrupt logic is shown in Figure 19.
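The hold-and-clear behavior of the external interrupt logic can be modeled as
follows. This is a minimal sketch: the class name, the CPE names, and the
reading of "majority" as a strict majority (more than half of the addressed
CPEs) are assumptions.

```python
class HeldInterrupt:
    """One external interrupt held by the CCE (hypothetical model)."""

    def __init__(self, targets):
        self.targets = frozenset(targets)   # CPEs the interrupt was routed to
        self.responded = set()              # CPEs that have answered so far

    def respond(self, cpe):
        """Record a response; return True once the interrupt may be cleared."""
        if cpe in self.targets:             # ignore CPEs that were not addressed
            self.responded.add(cpe)
        return self.cleared

    @property
    def cleared(self):
        # Assumed rule: strict majority of the addressed CPEs has responded.
        return 2 * len(self.responded) > len(self.targets)


irq = HeldInterrupt({"CPE_A", "CPE_B", "CPE_C"})   # interrupt to a TMR stream
first = irq.respond("CPE_A")     # one of three: still held
second = irq.respond("CPE_C")    # two of three: cleared
```

Under this reading a simplex stream clears the interrupt on its single
response, and a duplex stream requires both CPEs to respond.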
CCE Technology and Component Count. A CCE has been breadboarded out of T²L
small scale integrated circuit logic. For maximum reliability it should
ultimately be implemented with CMOS LSI technology. Table 2 shows the number
of gates and flip-flops required by each part of the CCE. Clearly the CCE
complexity would increase for larger numbers of controlled modules, but for
ARMS it contains less than 1200 equivalent gates and is simple enough to be
readily implemented on 2 or 3 large scale integrated circuits if desired.
TABLE 1. ARMMS PROCESSOR PRIORITY ASSIGNMENTS

Priority         Code   Proc. Type   Stream Criticality
 1. (Highest)    0000   BOSS         TMR
 2.              0001   CPE          TMR (Special)
 3.              0010   IO           SIMPLEX A (SA)
 4.              0100   IO           DUPLEX A (DA)
 5.              0110   IO           TMR (TR)
 6.              1000   IO           SIMPLEX B (SB)
 7.              1010   IO           DUPLEX B (DB)
 8.              1100   IO           SIMPLEX C (SC)
 9.              1110   IO           SIMPLEX D (SD)
10.              0011   CPE          SIMPLEX A (SA)
11.              0101   CPE          DUPLEX A (DA)
12.              0111   CPE          TMR (Normal) (TR)
13.              1001   CPE          SIMPLEX B (SB)
14.              1011   CPE          DUPLEX B (DB)
15.              1101   CPE          SIMPLEX C (SC)
16. (Lowest)     1111   CPE          SIMPLEX D (SD)

NOTE: IN A FULL PROCESSING STREAM AN IOP MAY BE GIVEN
THE STREAM'S CPE PRIORITY CODE.
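The priority codes in Table 1 follow a regular pattern: apart from the two
special codes 0000 and 0001, the low-order bit appears to separate IO (0) from
CPE (1) and the upper three bits select the stream. The Python sketch below
decodes a code and computes its priority rank on that assumption. The function
names are hypothetical and the bit interpretation is inferred from the table,
not stated in the report.

```python
# Upper three bits of the code -> stream, as read from the IO rows of Table 1.
STREAMS = {0b001: "SA", 0b010: "DA", 0b011: "TR", 0b100: "SB",
           0b101: "DB", 0b110: "SC", 0b111: "SD"}


def decode(code):
    """Return (processor type, stream) for a 4-bit priority code."""
    if code == 0b0000:
        return ("BOSS", "TMR")
    if code == 0b0001:
        return ("CPE", "TMR (Special)")
    kind = "CPE" if code & 1 else "IO"
    return (kind, STREAMS[code >> 1])


def rank(code):
    """Return the Table 1 priority, 1 (highest) through 16 (lowest)."""
    if code < 2:                 # BOSS and special TMR CPE come first
        return code + 1
    if code & 1 == 0:            # IO codes 0010..1110 -> priorities 3..9
        return (code >> 1) + 2
    return (code >> 1) + 9       # CPE codes 0011..1111 -> priorities 10..16


# Of two simultaneous requests, the one with the smaller rank is served:
# any IO stream outranks any normal CPE stream.
winner = min([0b0011, 0b1110], key=rank)
```

Note that priority is not simply the numeric order of the codes: every IO
stream ranks above every normal CPE stream.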
IOP AND CPE STREAMS MAY INDEPENDENTLY HAVE THESE 14 MODES:

4 Processors
    (SA, TR) or (TR, SB)
    (DA, DB)
    (SA, SB, DB) or
    (SA, DA, SB) or
    (DA, SB, SC)
    (SA, ..., SD)
3 Processors
    (TR)
    (SA, DA) or (DA, SB)
    (SA, ..., SC)
2 Processors
    (DA)
    (SA, SB)
1 Processor
    (SA)
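The mode list above can be checked mechanically. The sketch below assumes that
a simplex stream uses one processor, a duplex stream two, and a TMR stream
three, and confirms that each listed mode uses exactly the stated number of
processors and that there are 14 modes in all. The data structure and function
name are illustrative only.

```python
# Processors per stream, keyed by the first letter of the stream name
# (S = simplex, D = duplex, T = TMR). This mapping is an assumption.
SIZE = {"S": 1, "D": 2, "T": 3}

MODES = {
    4: [("SA", "TR"), ("TR", "SB"), ("DA", "DB"), ("SA", "SB", "DB"),
        ("SA", "DA", "SB"), ("DA", "SB", "SC"), ("SA", "SB", "SC", "SD")],
    3: [("TR",), ("SA", "DA"), ("DA", "SB"), ("SA", "SB", "SC")],
    2: [("DA",), ("SA", "SB")],
    1: [("SA",)],
}


def processors_used(mode):
    """Return the number of processors occupied by a tuple of streams."""
    return sum(SIZE[stream[0]] for stream in mode)


consistent = all(processors_used(m) == n
                 for n, modes in MODES.items() for m in modes)
total_modes = sum(len(modes) for modes in MODES.values())
```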
TABLE 2. CCE COMPONENT COUNT

Function                                        Gates   Flip-Flops   Total Equiv. Gates
1. CPE Status Controller                         131        28             299
2. IOP Status Controller                          51         8              99
3. Memory Status Controller                      159        36             375
4. Program Initiator/Reconfiguration Control     119        13             197
5. Fault Correlation                             134         0             134
6. Clock Control/Distribution                     14         4              38
7. Ext. Interrupt Logic                           20         2              32
   Total                                         628        91           1,174
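The equivalent gate totals in Table 2 are consistent with counting each
flip-flop as six gates (for example, 131 + 28 x 6 = 299). The report does not
state this weighting; it is inferred here from the table's own arithmetic, and
the short Python check below reproduces every row and the column totals on
that assumption.

```python
GATES_PER_FLIP_FLOP = 6   # inferred from Table 2, not stated in the report

# (function, gates, flip-flops, total equivalent gates) as listed in Table 2
ROWS = [
    ("CPE Status Controller", 131, 28, 299),
    ("IOP Status Controller", 51, 8, 99),
    ("Memory Status Controller", 159, 36, 375),
    ("Program Initiator/Reconfiguration Control", 119, 13, 197),
    ("Fault Correlation", 134, 0, 134),
    ("Clock Control/Distribution", 14, 4, 38),
    ("Ext. Interrupt Logic", 20, 2, 32),
]


def equivalent_gates(gates, flip_flops):
    """Return the equivalent gate count under the assumed weighting."""
    return gates + GATES_PER_FLIP_FLOP * flip_flops


rows_match = all(equivalent_gates(g, f) == total for _, g, f, total in ROWS)
totals = (sum(r[1] for r in ROWS), sum(r[2] for r in ROWS),
          sum(r[3] for r in ROWS))
```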
Figure 1. ARMMS System Configuration
(Block diagram: four main memory modules; BOSS parts 1 through N; four
processor-to-memory buses and four memory-to-processor buses; IOP and CPE
modules; system input bus and system output bus.)
Figure 2. ARMMS Processor/Memory Interconnections - I. Processor B Access to Memory Y in Simplex
(Legend: redundant connections; C = control from BOSS; S = status to BOSS.)
Figure 3. ARMMS Processor/Memory Interconnections - II. Processors A, B, D Access to Memories X, Y, Z in TMR
(Legend: redundant connections; C = control from BOSS; S = status to BOSS.)
Figure 4. ARMMS Memory/Processor Word Formats
(Processor-to-memory bus words PW1 through PW3: PW1 carries the priority code,
read/write, page address, word address and transfer control fields, with the
access request on a dedicated line; PW2 and PW3 carry data for write or
transfer operations only. Memory-to-processor bus words MW1 through MW3 carry
data. Across the three data words the bit order, from bit 1 upward, is CK1,
CK2, D1, CK3, D2-D4, CK4, D5-D11, CK5, D12-D26, CK6, D27-D32, P.)
FIGURE 6. BOSS MODULE MEMORY ACCESS CONTROL LOGIC
(Signals: BOSS INH A, BOSS MEM RQ A; CPE INH A-D; IO MEM RQ A-D; MEM RES A-D;
BOSS stream assign; bus interconnects A-D.)
COMPLEXITY: 17 GATES, 2 FLIP-FLOPS, APPROX. 20 PINS
FIGURE 7. I-O MODULE MEMORY ACCESS CONTROL LOGIC
(Signals: IO INH A, IO MEM RQ A; CPE INH A-D; BOSS INH A-D; MEM RES A-D;
stream assign; bus interconnects A-D.)
COMPLEXITY: 17 GATES, 2 FLIP-FLOPS, APPROX. 20 PINS
Figure 8. Memory Access Control Logic - (16 Priority Levels)
(Block diagram: processor-to-memory buses with priority codes; request
detection and priority ordering; memory criticality (simplex, duplex, TMR);
response generators; page address comparators; access register;
memory-to-processor bus select and output logic; page, criticality and
response bus assignment holding register loaded from BOSS.)
COMPLEXITY: 584 GATES; flip-flop and external connection counts illegible.
Could be two LSIs.
FIGURE 10. MEMORY MODULE ACCESS CONTROL LOGIC - DETAIL II
REQUEST DETECTION AND PRIORITY ORDERING LOGIC (ONE PER MODULE)
(Logic diagram: decoding for DUPA IO, intermediate priority access requests,
and SMPLEX CPE, the lowest priority access request.)
NOTES:
1. ALL SMPLEX DECODING IS DONE AS IN "SMPLEX IO", ALL DUPLEX DECODING
AS IN "DUPA IO", AND ALL TMR DECODING AS IN "TMR(SP)CPE".
2. ANY GIVEN PRIORITY LEVEL RECEIVES INHIBITS FROM ALL HIGHER PRIORITY LEVELS.
3. SUBSCRIPTS REFER TO BUSES, I.E., C4 IS SIGNAL C FOR BUS #4, ETC.
Figure 11. MODULE INTERFACE VOTING, MASKING & ERROR DETECTION LOGIC
(Three panels, each with inputs DATA 1-4 and BUSEN 1-4:
(1) MASKING/SWITCH ONLY (SIMPLEX, DUPLEX, TMR), producing MASKED DATA;
(2) VOTING ONLY (TMR), producing VOTED DATA;
(3) DETECTION ONLY (TMR - NO ISOLATION TO AN INDIVIDUAL BUS), producing
FAULT TO FAULT CONTROL LOGIC.)
Figure 12. Universal Bus Voter/Switch (One Bit Slice - 13 Required Per Module)
(Inputs: DATA 1-4, DS1-DS4, TMR. Outputs: OUT, FLT1-FLT4.)
DATA 1 ... DATA 4   DATA FOR EACH OF 4 BUSES
DS1 ... DS4         BUS SELECTION LOGIC OUTPUTS (RML)
TMR                 TMR MODE SELECT SIGNAL (1 = TMR, 0 = SMPLX, DUPLX)
OUT                 SIGNAL OUTPUT TO DATA REGISTERS
COMPLEXITY = 25 GATES/BIT, 4 LINES/BIT + 5 RAILS
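The behavior of one voter/switch bit slice can be approximated in software.
The Python sketch below is a behavioral model only, not the 25-gate circuit of
Figure 12. It assumes that in TMR mode exactly three buses are selected and
the output is their majority, with a fault flag raised for any selected bus
that disagrees, and that in simplex or duplex mode the first selected bus is
simply switched through. The function name and argument layout are
hypothetical.

```python
def voter_bit_slice(data, select, tmr):
    """Return (out, faults) for one bit position (behavioral model).

    data   -- four bits, one per bus (DATA 1..DATA 4)
    select -- four booleans, bus selection inputs (DS1..DS4)
    tmr    -- True for TMR mode, False for simplex/duplex
    """
    chosen = [i for i in range(4) if select[i]]
    if not chosen:
        raise ValueError("no bus selected")
    if tmr:
        if len(chosen) != 3:
            raise ValueError("TMR mode assumes exactly three selected buses")
        ones = sum(data[i] for i in chosen)
        out = 1 if ones >= 2 else 0          # two-out-of-three majority
        # Flag each selected bus whose bit differs from the voted value.
        faults = [select[i] and data[i] != out for i in range(4)]
    else:
        out = data[chosen[0]]                # switch one bus straight through
        faults = [False] * 4
    return out, faults


voted = voter_bit_slice([1, 1, 0, 0], [True, True, True, False], tmr=True)
switched = voter_bit_slice([0, 1, 0, 1], [False, True, False, False], tmr=False)
```

In the TMR example the third bus is outvoted and flagged; in the simplex
example the second bus is passed to the output unchanged.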
DESIGN OF A MODULAR DIGITAL COMPUTER SYSTEM
DRL 5
ARMS ENGINEERING BREADBOARD
FABRICATION AND TEST REPORT
APRIL 25, 1980
Prepared under Contract NAS8-27926
by
HUGHES AIRCRAFT COMPANY
FULLERTON, CALIFORNIA
for
GEORGE C. MARSHALL SPACE FLIGHT CENTER
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
FOREWORD
This report documents fabrication and test of the
ARMS Engineering Breadboard accomplished during the
fabrication and test phase of contract NAS8-27926
from June 1975 through December 1979. This effort
was a follow-on to the architecture study and logic
design phases of this contract previously completed
and documented.
CONTENTS
I.	 SINGLE STRING
1. Partial Single String Fabrication
2. Single String Test
3. IOP Fabrication and Test
4. Single String Operation
II. COMPLETE SYSTEM BUILDUP
1. CPE and Memory Modules Replicated
2. Modules Installed In Cabinet
3. Memory Modules Tested
4. CPE Modules Tested
5. Duplex and TMR Operation
6. Reconfiguration
7. System Verification Remaining
8. Problems Encountered
I - SINGLE STRING
1. Partial Single String Fabrication
A Partial Single String ARMS Engineering Breadboard (EB), as
dictated by incremental funding, was fabricated. The partial string
ARMS EB consisted of the following:
1 - Memory Module (MEM)
1 - Central Processing Element (CPE)
1 - Central Control Element (CCE)
1 - Maintenance/Status Panel and Electronics (MSPE)
These modules were assembled (IC's installed, etc.) on subassemblies
of the frames that would be installed in the cabinet at a later date.
Computer generated wiring programs were utilized to interconnect the
IC's with termi-point wiring and to also specify the subassembly
interconnections. The modules were housed in a temporary test fixture for
the duration of the single string test.
2. SINGLE STRING TEST
The purpose of testing in a single string configuration was to ensure
logical and functional correctness of each module before the memory
and CPE modules were replicated. A partial single string test was also
compatible with funding limitations.
Testing commenced with verification of power distribution throughout the
single string and verification of panel functions necessary for CCE
testing. The entire CCE module and remaining panel functions were then
tested in minute detail so that the CCE would function for single string
testing and later for full system testing. Detailed test progressed as
follows:
o Clock distribution internal to CCE and distribution to the
interface wiring for a complete system.
o Initialization operation of the module status control logic
was verified including module status logic for a complete system.
Stream page and bus assignments to the complete system were
verified.
o Operation of the Program Initiator and Reconfiguration Controller
was verified including all interrupt/response signals to the
complete system.
o Operation of the Fault Correlation Logic was verified. Each
fault interrupt was simulated and the proper response verified.
The CPE module was tested in minute detail for the reasons discussed
above and also so that it could be used as a reference later when other
CPE's were brought on line and data compared at redundant memory inter-
faces. Detailed test progressed as follows:
o Scanout verification
o Master clear verification
o Micro program control operation
o Registers operation
o ALU operation
o Detailed verification of each instruction in the instruction
set. Various short programs and other methods of inserting
data were used to exercise the various paths, options, etc.
through the microcode for each instruction. ROM simulators were
used in place of the PROMs so that the stored microcode could
be readily changed.
The Memory Module was tested in minute detail for the same reasons as
the CPE Module discussed above. Detailed test progressed as follows:
o Timing & control operation
o Integration with core memory module
o Voter switch and output multiplexer logic verification
o Fault detection logic verification
o Hamming/Parity encoder and corrector verification
As a demonstration of the fault tolerant capability, a successful,
continuous read/write operation was executed with the core memory
module logic partially disabled.
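The Hamming/parity encoder and corrector verified above can be illustrated
with a software model. The report does not give the code in detail; the Python
sketch below assumes a standard single-error-correcting, double-error-detecting
Hamming code over 32 data bits, with six check bits at positions 1, 2, 4, 8,
16 and 32 and an overall parity bit at position 39, which is the layout the
Figure 4 word formats appear to follow. All names are hypothetical.

```python
CHECK_POSITIONS = (1, 2, 4, 8, 16, 32)                  # CK1..CK6 (assumed)
DATA_POSITIONS = [p for p in range(1, 39)
                  if p not in CHECK_POSITIONS]           # D1..D32
PARITY_POSITION = 39                                     # P (assumed)


def _syndrome(word):
    """Return the position indicated by the failing check groups (0 = none)."""
    s = 0
    for c in CHECK_POSITIONS:
        if sum(word[p] for p in range(1, 39) if p & c) % 2:
            s |= c
    return s


def encode(data):
    """Encode a 32-bit integer as a list of 40 bits (index 0 unused)."""
    word = [0] * 40
    for k, pos in enumerate(DATA_POSITIONS):
        word[pos] = (data >> k) & 1
    for c in CHECK_POSITIONS:
        # Each check bit makes the parity of its position group even.
        word[c] = sum(word[p] for p in range(1, 39) if p & c) % 2
    word[PARITY_POSITION] = sum(word[1:39]) % 2          # overall even parity
    return word


def decode(word):
    """Return (data, status); status is ok, corrected, or uncorrectable."""
    word = list(word)
    syndrome = _syndrome(word)
    parity_odd = sum(word[1:40]) % 2 == 1
    status = "ok"
    if parity_odd:
        if syndrome > 38:                    # points outside the word
            return None, "uncorrectable"
        # Single-bit error: flip the indicated bit (0 means P itself).
        word[syndrome or PARITY_POSITION] ^= 1
        status = "corrected"
    elif syndrome:
        return None, "uncorrectable"         # double-bit error detected
    data = sum(word[pos] << k for k, pos in enumerate(DATA_POSITIONS))
    return data, status


clean = encode(0xDEADBEEF)
single = list(clean)
single[17] ^= 1                              # inject one bit error
double = list(clean)
double[3] ^= 1
double[20] ^= 1                              # inject two bit errors
```

Decoding `single` recovers the original data with a corrected status, which is
the kind of masking that lets a memory continue operating with a single failed
bit; decoding `double` reports the word as uncorrectable.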
3. IOP FABRICATION AND TEST
An IOP module was fabricated and added to the single string. The IOP
module was assembled on a subassembly of the same type as the other single
string modules and installed in the temporary test fixture. Computer
generated wiring programs were used to automatically wire the IC's.
Wire wrap wiring was the most cost effective method of wiring at this
point in time.
The IOP Module was tested in minute detail. Detailed test progressed
as follows:
o Common Control operation was verified. Handshaking with
reference to CAW, CCW, CSW, CC, & IO interrupt was tested.
o TTY channel operation was verified along with the Data Terminal
Controller operation. Data transfers and IO instructions were
executed to verify the TTY interface.
o Fault Detection logic was verified.
4. SINGLE STRING OPERATION
Single string operation was verified by loading the TTY cassette with
a short program, transferring that program from the cassette to computer
memory and from memory out to the TTY printer. The program execution
verified all IO instructions and other instructions of the SUMC subset.
II - COMPLETE SYSTEM BUILDUP
1. CPE and Memory Modules Replicated
PROMs for the three additional CPE modules were blown from the
updated control PROM data. Updated computer generated wiring programs
were used to automatically wire the three additional modules of each
type, CPE and Memory. The wire wrap method of wiring was used. The
additional modules were assembled on the same type of subassemblies
as the single string modules.
2. Modules Installed In Cabinet
The replicated modules along with the original single string modules
and three additional core memory modules were installed in the ARMS
EB cabinet. The backplane was wired interconnecting all modules.
Power was connected to the cabinet and power distribution was tested.
3. Memory Modules Tested
All three memory modules were tested in a like fashion. The
internal fault detection logic was utilized to detect fabrication
errors (misplaced IC's, wiring errors, etc.). Short programs such
as the IOP test program were run and each successive module auto-
matically compared in duplex mode against the previous module at
redundant memory interfaces.
The three additional core memories were integrated with memory
modules.
The programs run also verified operation of all memory modules with
the IOP.
4. CPE Modules Tested
The approach to CPE module testing was almost identical to that of
memory module testing. All three CPE modules were tested in a like
fashion. The internal fault detection logic was utilized to detect
fabrication errors (misplaced IC's, wiring errors, etc.). 	 Short
programs such as the IOP test program were run and each successive
module automatically compared in duplex mode against the previous
module at redundant memory interfaces.
5. Duplex and TMR Operation
Duplex and TMR operations were verified by setting up in the appropriate
configuration and running a short program that input the program from
the TTY cassette to computer memory, massaged some of the data and
output the program to the TTY printer.
6. Reconfiguration
Errors were inserted at the CPE and reconfiguration was verified while
single clocking through the operation.
Dynamic reconfiguration was observed at the panel scanouts when errors
of opportunity occurred.
7. System Verification Remaining
More exhaustive verification of the ARMS capabilities could be accomplished
by injecting a much larger quantity and more varied range of faults.
This fault injecting would be particularly effective if accompanied
by a more thorough diagnostics program.
8. Problems Encountered
No problems of a system concept nature were encountered. The detailed
logic tests and operational tests indicated that the system performed
as conceived.
An area of checkout where many design problems were resolved was the
CPE microcode debug. Resolution of these problems was relatively easy
because the microcode was stored in ROM simulators which facilitated
correcting the code.
