Advanced information processing system for advanced launch system: Avionics architecture synthesis by Harper, Richard E. et al.
NASA Contractor Report-187554 
Advanced Information Processing System for 
Advanced Launch System: 
Avionics Architecture Synthesis 
Jaynarayan H. Lala 
Richard E. Harper 
Kenneth R. Jaskowiak 
Gene Rosch 
Linda S. Alger 
Andrei L. Schor 
THE CHARLES STARK DRAPER LABORATORY, INC. 
CAMBRIDGE, MA 02139 
Contract NAS1-18565 
September 1991 
NASA 
National Aeronautics and 
S pace Administration 
Langley Research Center 
Hampton, Virginia 23665-5225
	
( 44A-C -	 T55' )	 A 1VAC	 i -	 '\T I PROCESSING SYSTEM FOR ADVANCED LAUNCH 
SYSTEM: AVIONICS ARCHITECTURE SYNTHESIS (Oraper (Charles Stark) Lab.) 155 pCSCL 098 
-	
G3/62 
	
-u-- -	 •- - - -	 -.	 -
N92-11704 
Unci as 
0046446 
https://ntrs.nasa.gov/search.jsp?R=19920002486 2020-03-17T15:16:22+00:00Z
•RISHIAL CONTAINS 
NASA Contractor Report-187554 	
am IUMUTINS,
 
SO3Lfy 
Advanced Information Processing System for 
Advanced Launch System: 
Avionics Architecture Synthesis 
Jaynarayan H. Lala 
Richard E. Harper 
Kenneth R. Jaskowiak 
Gene Rosch 
Linda S. Alger 
Andrei L. Schor 
THE CHARLES STARK DRAPER LABORATORY, INC. 
CAMBRIDGE, MA 02139 
Contract NAS1-18565 
September 1991 
NASA 
National Aeronautics and 
Space Administration 
Langley Research Center 
Hampton, Virginia 23665-5225
TABLE OF CONTENTS 
Title	 Page 
Listof	 Illustrations .................................................................................... V
Listof Tables........................................................................................vii 
1.0 INTRODUCTION ...........................................................................1-1 
1.1 Design for Validation Methodology..................................................1-1 
1.2 Avionics Architecture Synthesis Overview.......................................... 1-5 
2.0 ADVANCED LAUNCH SYSTEM REQUIREMENTS.................................2-1 
2.1 Functional Requirements ..............................................................2-1 
2.2 Performance Requirements............................................................2-1 
2.2.1 Processing Requirements.....................................................2-1 
2.2.1.1 Advanced Launch System (ALS) Raw Requirements .........2-1 
2.2.1.2	 Derived Requirements ............................................2-12 
2.2.2 110 and Interfunction Communication Requirements..................... 2-15 
2.2.2.1	 Core FTPIJO ......................................................2-23 
2.2.2.2	 Core VIP - Propulsion FTP Communications................. 2-24 
2.2.2.3	 Propulsion VFPl/O ............................................... 2-25 
2.2.2.4	 Summary of I/O and Interfunction Communication ............... 
Requirements......................................................2-26 
2.3	 RMA and Environmental Requirements ............................................2-26 
2.3.1 Reliability	 and	 Availability .................................................... 2-26 
2.3.2 Maintainability ......................... . ...................................... 2-28 
2.3.3 Component Quality .......... ....
 ................................................ 2-29 2.3.4 Radiation	 Hardness ....... ......... ............ -31 
2.3.5 Power Dissipation ............................................................. 2-31 
2.4	 Requirements Conclusions ........................................................... 2-32 
3.0 AlPS ARCHITECTURE OVERVIEW....................................................3-1 
3.1	 Building	 Blocks.........................................................................3-1 
3.2	 Virtual Architecture ..................................................................... 3-2 
3.3	 Physical Architecture................................................................... 3-4 
3.4	 System Services ........................................................................ 3-7 
3-5	 Flight System Characteristics of Building Blocks.................................. 3-9 
3.5.1	 Functional Building Blocks .................................................. 3-9 
3.5.1.1	 Fault Tolerant Processor Channel................................ 3-9 
15.1.1.1	 Processor............................................3-10 
3.5.1.1.2	 Shared Devices ...................................... 3-11 
3.5.1.1.3	 ICIS and lOS Hardware ........................... 3-11 
3.5.1.1.4	 Communicator and lnterstage ..................... 3-15 
3.5.1.1.5	 FTC and 40 MHz Signal Determinism........... 3-19 
3.5.1.2	 Communications Node ........................................... 3-19 
3.5.1.2.1	 Node Controller..................................... 3-21 
3.5.1.2.2	 Communications Port Receiver/Transmitter ..... 3-22 
3.5.2	 Projected Performance Parameters ......................................... 3-23 
3.5.2.1	 Fault Tolerant Processor Channel............................... 3-23 
3.5.2.2	 Communications Node ........................................... 3-24 
3.5.2.3	 AlPS for ALS Avionics Packaging ............................. 3-24 
3.5.3	 Hardware Failure Rate Projections......................................... 3-24 
3.5.4	 Performance Projections of AlPS for ALS Building Blocks............ 3-32
111 
PRECEDING PAGE BLANK NOT FILMED 
LJLJNnIONA11v PAt4 
4.0 PRELIMINARY ALS AVIONICS ARCHITECTURE . 4-1 
4.1 Virtual Architecture ...................................................................... 4-1 
4.2 Physical Architecture .................................................................... 4.4 
4.3 Physical Characteristics................................................................ 4-6 
4.4 Reliability and Availability Projections............................................... 4-7 
4.5 Architecture Summary and Conclusions............................................4-10 
5.0 IMPACT OF AIPS/ALS ARCHITECTURE ON ALS COST..........................5-1 
5.1 Introduction ...................................... . ...................................... 5-1 
5.2 Problem Definition ...................................................................... 5-1 
5.2.1	 General Description ........................................................... 5-1 
5.2.2	 Contributors	 to	 Cost........................................................... 5-2 
5.2.3	 Architecture Definitions....................................................... 5-3 
5.2.3.1	 Architecture 1 .......................................................5-4 
5.2.3.2	 Architecture 2 ....................................................... 5-6 
5.3 The Cost Model ......................................................................... 5-6 
5.3.1	 System Parameters ............................................................5-7 
5.3.2	 Cost	 of	 System ................................................................. 5-7 
5.3.3	 Cost of Launch Weight .......................................................5-8 
5.3.4	 Cost	 of	 Unreliability .............................. . ............................. 5-8 
5.3.4.1	 On the Launch Pad ................................................ 5-10 
5.3.4.1.1	 Architecture	 1........................................5-11 
5.3.4.1.2	 Architecture	 2........................................ 5-13 
5.3.4.2	 During Launch and Flight ........................................ 5-21 
5.3.4.2.1	 Architecture	 1 ........................................ 5-21 
5.3.4.2.2	 Architecture	 2 ......................................... 5-23 
5.3.5	 Total	 Cost ............ ......................................................... 5-24 
5.4 Results..... ....................................................... ... ...................... 5-24 
5.5 Conclusions and Suggestons for Future Work ....................................5-29
6.0 SUMMARY AND CONCLUSIONS...................................................... 6-1 
7 .0 REFERENCES............................................................................... 7-1
 APPENDIX A HIGHEST LEVEL SPREADSHEET OF THE COST MODEL ......... A-i 
APPENDIX B MODEL REDUCTION TECHNIQUE FOR FTP ANALYSIS .......... B-i 
iv 
LIST OF ILLUSTRATIONS 
Figure	 Title	 Page 
1-1. ALPS Design for Validation Methodology............................................. 1-2 
2-1. ALS Avionics System Functions ....................................................... 2-2 
2-2. LASS (MM Coded) Benchmark Results ............................................... 2-13 
2-3. LASS (CSDL Coded) Benchmark Results ........................................... 2-13 
2-4. Functional Partitioning and I/O and Interfunction Communication Bandwidths 
forthe ALS...............................................................................2-27 
2-5. SEM-E	 Module........................................................................... 2-30 
3-1. AlPS Hardware Building Blocks ........................................................ 3-2 
3-2. AlPS Virtual Architecture................................................................ 3-3 
3-3. ALPS Processing Site Virtual Architecture............................................. 3-5 
3-4. ALPS Quad Redundant Fault Tolerant Processor: Fault Containment Regions....... 
and	 Interconnections...................................................................... 3-5 
3-5. ALPS Engineering Model Configuration............................................... 3-8 
3-6. Fault	 Tolerant	 Processor	 Channel...................................................... 3-10 
3-7. Network	 Interface	 Sequencer........................................................... 3-12 
3-8. IC and I/O Network Data Frame Format.............................................. 3-14 
3-9. Network Arbitration Timing ............................................................ 3-15 
3-10. FTP	 Communicator...................................................................... 3-17 
3-11. FTP	 Interstage ............................................................................ 3-18 
3-12. FTP	 Fault	 Tolerant	 Clock	 Topology................................................... 3-20 
3-13. FTC "Majority" Voter Operation ........................................................ 3-20 
3-14. Clock	 Corrections ......................................................................... 3-21 
3-15. Communications Node .................................................................... 3-22 
3 -16. Node Port................................................................................. 3-23 
3-17. AlPS	 for	 ALS	 Packaging	 Concept ..................................................... 3-25 
5-1. Architecture	 1.............................................................................. 5-5 
5-2. Architecture	 2 .............................................................................. 5-5 
5-3. Pad Markov Model for Architecture 1................................................. 5-15 
5-4. Triplex FTP Markov Model ............................................................ 5-16 
5-5. Reduced Triplex FTP Markov Model .................................................5-20 
5-6. Launch/Flight Markov Model for Architecture 1..................................... 5-22 
5-7. Baseline	 Results.......................................................................... 5-26 
5-8. Sensitivity to Component Quality...................................................... 5-28 
5-9. Sensitivity to Payload Value............................................................ 5-29 
5-10. Sensitivity to Time on Pad.............................................................. 5-31 
5-11. Sensitivity to Flight Time............................................................... 5-32 
5-12. Sensitivity to Repair Time .............................................................. 5-33 
B-i. Detailed Markov Model of a Triplex FTP .............................................. B-3 
B-2. Notation Convention ..................................................................... B-6 
B-3. Reduced Markov Model of a Triplex FTP........................................... B-10 
B-4. Markov Model ofa Dual FTP ......................................................... B-13 
B-S. Reduced Markov Model ofa Dual FTP ....................... . ...................... B-13
v 
LIST OF TABLES 
Table Title Page 
2-1. Central Control & Processing......................................................... 2-5 
2-2. Winds	 Ahead	 Determination...........................................................2-6 
2-3. Vehicle Power System Management .................................................2-7 
2-4. Steering	 &	 Staging	 Control............................................................ 2-8 
2-5. Sensor Processing......................................................................2-9 
2-6. Propulsion Control....................................................................2-10 
2-7. Command & Telemetry Processing .................................................2-11 
2-8. Range	 Safety	 &	 Destruct..............................................................2-11 
2-9. Programmable Payload I/F...........................................................2-12 
2-10. Central Control & Processing (Modified) ..........................................2-16 
2-11. Winds Ahead Determination (Modified)............................................2-17 
2-12. Vehicle Power System Management (Modified) ................................... 2-18 
2-13. Steering & Staging Control (Modified) ............................................. 2-19 
2-14. Sensor Processing (Modified) .......................................................2-20 
2-15. Propulsion	 Control	 (Modified) ....................................................... 2-21 
2-16. Command	 & TLM Processing 	 (Modified) ..........................................2-22 
2-17. Range Safety & Destruct (Modified) ................................................. 2-22 
248. Programmable	 Payload	 I/F	 (Modified) ...............................................2-23 
2-19. Aggregate 1/0 and Interfunction Communication Bandwidths for ALS .........2-26 
3-1. CPU	 Module	 Parts	 List	 (Ground	 Fixed)............................................ 3-28 
3-2. CPU Module Parts List (Missile Launch) ..........................................3-28 
3-3. Shared Devices Module Parts List (Ground Fixed) ...............................3-29 
3-4. Shared Devices Module Parts List (Missile Launch)..............................3-29 
3-5. COM/INT Module Parts List (Ground Fixed) .... . ................................ 3-29 
3-6. COM/INT Module Parts List (Missile Launch)....................................3-29 
3-7. NIS	 Module	 Parts	 List	 (Ground	 Fixed)............................................. 3-30 
3-8. ICIC/IOS Module Parts List (Missile Launch).....................................3-30 
3-9. CN Module Parts List (Ground Fixed) .............................................3-30 
3-10. CN Module Parts List (Missile Launch) ............................................. 3-30 
3-11. Channel and CN Failure Rates (Ground Fixed) ...................................3-31 
3-12. Channel and CN Failure Rates (Missile Launch/Boost Phase)................... 3-31 
3-13. Channel and CN Failure Rates (On-orbit)..........................................3-31 
3-14. AIPS/ALS FTP and Local System Services Performance ......................... 3-34 
3-15. AIPS/ALS I/O Network and I/O System Services Performance .....  ............ 3-35 
3-16. AIPS/ALS IC Network and IC Communication Services Performance ........3-35 
4-1. AIPS/ALS Fault Tolerant Processor Physical Characteristics..................... 4-6 
4-2. Availability and Reliability of ALS Avionics (Quad FTPs)........................ 4-9 
4-3. Availability and Reliability of ALS Avionics (Triplex FTPs) .....................4-9 
5-1. States of Architecture 1 Pad Markov Model........................................5-12
vu 
Iff-WAITICONALtY f4_04 PRECEDING PAGE BLANK NOT FILMED 
	5-2.	 Transition Rates for FTP Markov Model 	 .5-17 
	
5-2.	 Transition. Rates for FTP Markov Model (Cont.)..................................5-18 
	
5-3.	 Symbol Definitions for FTP Markov Model .......................................5-19 
	
5-4.	 States of Architecture 1 Launch Markov Model....................................5-23 
	
5-5.	 Initial Probabilities of Launch/Flight Markov Model for First Detected Failure 
Repair Strategy of Architecture 1 ..........................................  .......... 5-23 
	
5-6.	 Component Quality Factors ..........................................................5-27
ADVANCED INFORMATION PROCESSING SYSTEM

FOR
ADVANCED LAUNCH SYSTEM: 
AVIONICS ARCHITECTURE SYNTHESIS 
1.0 INTRODUCTION 
The goal of the Advanced Information Processing System (ALPS) is to achieve a 
validated fault tolerant distributed computer system architecture to meet the real time 
computational needs of advanced aerospace vehicles. One such vehicle is the Advanced 
Launch System (ALS) being developed jointly by the National Aeronautics and Space 
Administration and the Department of Defense to launch heavy payloads into low earth 
orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An 
avionics architecture that utilizes the AlPS hardware and software building blocks has been 
synthesized for ALS. This report describes the AlPS for ALS architecture synthesis 
process starting with the ALS mission requirements and ending with an analysis of the 
candidate ALS avionics architecture. 
1.1 Design for Validation Methodology 
The ALS architecture synthesis process follows a new design for validation 
methodology that has been developed as part of the AlPS program to assure that fault 
tolerant computer system architectures for advanced applications meet the reliability, 
performance and other goals of the application. The design methodology is described in 
detail and compared to the conventional avionics design methodology in an accompanying 
report [1]. It is recapitulated here briefly to put various steps of the ALS architecture 
synthesis process in context. 
In the AlPS design for validation methodology, a set of functional requirements is 
derived from the mission requirements and translated into avionics requirements, as shown 
in Figure 1-1. These avionics requirements are then mapped into hardware and software 
building blocks using a knowledgebase and future technology projections. Validation of 
the ALPS building blocks and generation of the knowledgebase and technology projections 
are goals of the AlPS program. The validation is being addressed using a combination of 
mathematical proofs, analytical models, and empirical test and evaluation. The architecture 
knowledgebase allows the designer to synthesize the architecture in accordance with rules 
and guidelines such that the fundamental principles of fault tolerance are adhered to and 
rationale for each design decision is related to overall mission requirements. The building 
block knowledgebase provides the designer with a detailed characterization of the 
performability and other important parameters of each building block. These 
characterizations, in conjunction with advanced technology projections, are used to project 
1-1
(Mission Scenario &
Operational Environment) 
System Functional Requirements (GN&C, Propulsion Control, 
Vehicle Health Monitor) 
Avionic& Requirements 
(Performance, RMA, Wt., Vol., 
Power, Cost)
Block 
Technology	 Synthesize Candidate 
Projections	 Avionics Architectures 
Analyze RMA, Performance, etc. 
for ALS Mission Scenario
RMA, Wt, Vol. 
AlPS Architecture 
Knowledgebase (Arch 
Rules, Guidelines, 
Attributes Graph) 
Finalize Architecture 
State-Of-The-Art 
Technology -
H/W & SIW Implementation 
(Brass Board, ASIC Designs) 
Implementation Validate Brass Board 
Iand ASIC Implementation
Building Block 
n Knowledgbase 
& S/W Requirer 
Design Details - 
AlPS Validation 
Knowledgebase(Proofs, 
Simulations, Test & 
Evaluation Results) 
Integrate with
Vehicle Subsystems 
Figure 14. AlPS Design for Validation Methodology 
1-2
building blocks implemented in state-of-the-art hardware and software technology. The 
AlPS hardware and software building blocks have been designed such that their major 
attributes such as Byzantine resilient fault tolerance, simplex programming model, 
reconfigurability, rigorous separation of redundancy management and applications 
software, etc. are not dependent on any specific technology of implementation. (The goal 
of the AlPS program is to validate these and other reliability and performance attributes of 
the AlPS hardware and software building blocks and make these building blocks available 
to programs like the ALS.) Furthermore, the system services are implemented such that 
their overheads become smaller as the processor and communication speeds increase. 
Therefore, the building blocks do not become obsolete with technology advancements. 
Their performance increases in direct proportion to improvements in processing and 
communication speeds. 
This design methodology makes the architecture synthesis task much less of an art 
and personal judgement and provides a solid foundation of knowledgebase on which to 
base design decisions. In the conventional design methodology, validation of the 
architectural characteristics such as redundancy management, reliability, maintainability, 
performance, etc. is done in parallel with the validation of the specific hardware and 
software implementation of the design. The design for validation methodology decouples 
the validation of the architecture design from the validation of a specific hardware and 
software implementation of that design. The combination of architectural rules and 
guidelines, the preválidated building blocks knowledgebase and the analytical models Of the 
performability of the synthesized architecture assure a validated architecture. A validated 
architecture is defined to be an architectural concept that when implemented in hardware 
and software will meet various mission requirements such as reliability, throughput, 
transport lag, cost, weight, volume, power, etc. 
• Section 2 of this report describes the ALS requirements. The very high level 
requirements were collectively obtained from the three prime contractors, Boeing, General 
Dynamics, and Martin Marietta, and provided to Draper by Martin Marietta. These include 
the general ALS mission scenario and related parameters such as time on the launch pad, 
launch availability, mission duration and reliability. Other ALS requirements include ALS 
computational functions such as Guidance, Control, Navigation, etc. The functional 
requirements were translated into detailed computational requirements such as throughput, 
memory, processing lag, function iteration rate, 1/0 and interfunction communication rates, 
etc., by Martin Marietta with some feedback from and interaction with Draper on the format 
and content. These "raw" computational requirements were then converted into "derived" 
requirements by Draper with Martin Marietta's assistance. The conversion was necessary 
to accurately reflect the overheads of the Ada language and compiler to be used in 
programming AlPS for ALS compared to the assembly languages that have traditionally 
been used to program the launch vehicle avionics. 
1-3
The AlPS architectural attributes, rules and guidelines, and reliability and 
performance models of the building blocks are described in detail in the accompanying 
report "Advanced Information Processing System: Design and Validation Knowledgebase" 
[1]. It also contains references to the more detailed hardware and software specifications 
and simulations that constitute the AlPS knowledgebase. This knowledgebase, which is 
quite large, is required to synthesize a validated ALS avionics architecture. Section 3 of 
this report briefly recapitulates the AlPS virtual and physical architectures and the key 
attributes of the hardware and software building blocks. 
The ALPS architecture and its attributes are transparent to the microelectronics 
technology. The hardware building blocks can be implemented in the state-of-the-art 
technology for improved performance and reliability while still retaining all of the validated 
characteristics of the ALPS architecture. Similarly, the software building blocks, i.e., the 
system services, can be implemented using the latest Ada run time system and compiler. In 
order to project the reliability and performance that can be expected of the ALPS building 
blocks if they were implemented using the technology that will be available in the ALS time 
frame, a task to survey the technology was undertaken. An accompanying report "Advance 
Information Processing System for Advanced Launch System: Hardware Technology 
Survey and Projections" [2] describes the results of this survey in detail. Section 3.6 of 
this report describes the flight system characteristics, including hardware implementation, 
performance projections, and module failure rate projections, of the AlPS for ALS building 
blocks.
The projected building block performability characteristics are then used to 
configure a candidate ALPS for ALS avionics architecture that meets the ALS performance 
and reliability requirements. This process of configuring the building blocks is explained 
in greater detail in the next subsection 1.2. The resulting ALS avionics architecture is 
described in Section 4 of this report. 
The reliability and performance of the candidate ALS avionics architecture have 
been modeled using the generic models of the ALPS building blocks developed earlier [1] 
but using the ALS specific parameters. The ALS specific parameters include the mission 
duration, launch pad time, repair strategy and other mission related parameters as well as 
the specifics of the ALPS for ALS building blocks such as the projected module failure 
rates, the fault detection, isolation and recovery times, various operating system service 
and redundancy management overheads, etc. Section 4.3 provides the details of the 
reliability and performance projections of the ALS avionics architecture. 
Further refinement of the architecture would be necessary if the projected 
performability does not meet the ALS requirements. After the configuration has been 
revised to meet the requirements, the next step in the ALS avionics design would be the 
detailed design of the hardware and software building blocks using the state-of-the-art 
microelectronics and software technology. This was outside the scope of the present study 
1-4
contract. Section 6 summarizes the results of this study with some thoughts on the steps 
necessary to reach a validated ALPS for ALS flight system hardware and software. 
One of the prime drivers of the Advanced Launch System design is the reduction in 
the cost of launching a pound of payload by an order of magnitude over the current launch 
vehicles. The ALS on-board computer systems have the potential to reduce the operational 
cost by automating a number of functions that are now performed manually by "standing 
armies" of ground controllers, by making the launch window less vulnerable to weather 
through the use of adaptive navigation and guidance algorithms, by reducing the cost of 
launch pad operations through "maintenance free" fault tolerant computers and by reducing 
the cost of launch vehicle failures through the use of fault tolerance techniques. In 
addition, the cost of the avionics themselves can be reduced by changing the current design 
philosophy of single-string, non-fault tolerant systems built out of the highest quality 
(Class S) components with additional quality control checks at every stage from 
manufacture to launch to a philosophy of fault tolerant systems that do not use the most 
expensive components but are actually more reliable than single string systems. Section 5 
of this report discusses in detail the impact of the AlPS for ALS architecture on the ALS 
cost. 
1.2 Avionics Architecture Synthesis Overview 
Architecture synthesis can be thought of as 
.
a constrained optimization problem. At 
the highest level the problem objective can be stated as: minimize cost subject to meeting all 
the avionics requirements. Cost studies, including the one described in Section 5 of this 
report, have shown that fault tolerant avionics constructed out of Class B components can 
be more cost effective for the ALS as a whole than single string avionics built out of Class 
S parts. Since the AlPS architecture provides prevalidated fault tolerant building blocks for 
ALS type applications, the architecture synthesis problem can be restated as: configure 
AlPS hardware and software building blocks to meet the ALS avionics requirements. 
As discussed in the preceding section on the design for validation methodology, 
AlPS for ALS configuration(s) are defined using as inputs the ALPS architectural rules, 
guidelines and attributes, the projected reliability, performance, physical characteristics and 
other attributes of the building blocks, and the ALS avionics requirements. The process of 
matching the avionics requirements with the building block capabilities is a 
multidimensional problem. However, it can be simplified by decomposing the 
requirements into two orthogonal sets each of which can be mapped independently of the 
other as a first order approximation and each of which determines a different aspect of the 
architecture. The performance related ALS requirements such as throughput, memory, 
transport lag, input/output latencies, etc. determine the virtual avionics architecture. The 
reliability related ALS requirements such as probability of mission success, launch 
availability, launch pad maintenance, function criticality, etc. determine the physical 
avionics architecture.
1-5
The virtual avionics architecture definition includes the number of Fault Tolerant 
Processors (FTPs), allocation of functions to FTPs, partitioning of functions between the 
Computational Processor (CP) and the I/O Processor (lOP) within each FTP, the number 
and type of sensors and actuators and their interconnections to FTPs via the I/O networks, 
etc. The physical avionics architecture definition includes such parameters as the 
redundancy level of FTPs, the redundancy levels of Inter-Computer and I/O networks, the 
physical topologies of networks, the redundancy level of sensors, actuators and other I/O 
devices, the cross-strapping of I/O devices to channels of FTPs and the redundancy levels 
of interfaces, etc. 
A preliminary virtual architecture can be defined by grouping and allocating functions 
to processing sites, grouping and allocating corresponding sensors and actuators to the 
same FTPs, and partitioning the functions in each site between the lOP and the CP. A 
subset of the AlPS System Services and modes of operation, such as scheduling of tasks, 
definition of I/O chains to acquire sensor data and send Out actuator commands, etc. can 
then be selected to complete the virtual architecture definition. Criteria for grouping 
functions in one processing site include functions requiring time critical or high 
communication rates, physical location (e.g. propulsion controller to be located on the 
engine or recoverable avionics to be located in a separate module), like-criticality functions, 
etc. Some of the constraints for grouping and allocating functions to one processing site 
include the maximum useful throughput available in an FTP and the FTP data exchange 
bandwidth which is necessary to perform interactive consistency and internal congruent 
distribution of all the sensors connected to that FTP at the iteration rates necessary to 
support all the functions executing in that FTP. Once a preliminary, allocation of functions 
has been completed, one can determine using the ADAS (Architecture Design and 
Assessment System) tool and simulations whether the performance criteria have been met. 
These include the transport lag for each function, inter-function communication latencies 
and rates, processor utilization and reserve throughput, etc. Functions can be reallocated, 
regrouped or number of processing sites added or deleted depending upon the results of 
performance modeling. The preliminary virtual architecture can be fine tuned using the 
results of the analytical models which are described in Section 4 of [1]. 
The process of defining, analyzing and fine tuning the physical architecture is similar 
to that for the virtual architecture; only the analysis tools and measures of merit are 
different. The analysis tools include the Markov models, combinatorial models, etc. The 
measures of merit include the probability of mission success, launch availability, 
probability of repair on the launch pad, etc. 
Preliminary AlPS for ALS virtual and physical avionics architectures are presented in 
Sections 4.1 and 4.2, respectively, of this report. A preliminary set of reliability and 
availability projections are described in Section 4.3. 
1-6
2.0 ADVANCED LAUNCH SYSTEM REQUIREMENTS 
As explained in the introductory section of this report, one of the inputs to the ALS 
architecture synthesis process is the set of ALS requirements. The very high level 
requirements were collectively obtained from the three prime contractors, Boeing, General 
Dynamics, and Martin Marietta, and provided to Draper by Martin Marietta. These include 
the general ALS mission scenario and related parameters such as time on the launch pad, 
launch availability, mission duration and reliability. Other ALS requirements include ALS 
computational functions such as Guidance, Control, Navigation, etc. The functional 
requirements were translated into detailed computational requirements such as throughput, 
memory, processing lag, function iteration rate, I/O and interfunction communication rates, 
etc., by Martin Marietta with some feedback from and interaction with Draper on the format 
of the contents. These "raw" computational requirements were then converted into 
"derived" requirements by Draper with Martin Marietta's assistance. The conversion was 
necessary to accurately reflect the overheads of the Ada language and compiler to be used in 
programming AlPS for ALS compared to the assembly languages that have traditionally 
been used to program the launch vehicle avionics. 
Section 2.1 describes the ALS functional requirements. The functional 
requirements were translated into processing requirements, described in Section 2.2.1, and 
I/O and interfunction commtrnication requirements, described in Section 2.2.2. Section 2.3 
captures the other ALS requirements such as reliability, maintainability, availability and 
operating environment. 	 .	 .	 . .	 . 
2.1 Functional Requirements 
Nine top-level ALS functions have been identified by Martin Marietta: Central 
Control and Processing, Winds Ahead Determination, Vehicle Power System Management, 
Steering and Staging Control, Sensor Processing, Propulsion Control, Command and 
Telemetry Processing, Range Safety and Destruct, and Programmable Payload Interface. 
Figure 2-1 shows the top level requirements along with the breakdown of each function 
into its next level component sub-functions, for all except the Programmable Payload 
Interface. The function hierarchy is three levels deep, with functions at the third level 
assumed to be equivalent to executable, dispatchable tasks. 
2.2 Performance Requirements 
2.2.1 Processing Requirements 
2.2.1.1 Advanced Launch System (ALS) Raw Requirements 
The ALS computational and interfunction communication requirements were 
provided to CSDL in the format of two HypercardTM documents: one describes processing 
requirements and the other describes interfunction communication requirements [3, 4]. The 
most recent such requirements received by CSDL were dated December 7, 1989. 
2-1
Figure 2.1. ALS Avionics System Functions 
The ALS processing requirements are expressed by Martin Marietta in a 
hierarchical, dataflow-like representation. Currently, the function hierarchy is three levels 
deep, with functions at the third level assumed to be equivalent to executable, dispatchable 
tasks. Nine top-level functions have been identified. Computational requirements are 
given at multiple layers of the hierarchy, with the requirements of a higher-level function 
consisting of aggregate summations of the numerical requirements of its lower-level con-
stituent functions. These aggregates have been observed to give an upper bound on the 
throughput requirements. To perform an accurate and meaningful synthesis of the avionics 
system, the following requirements are needed for each dispatchable task (level three in the 
representation hierarchy): 
Frame rate 
Throughput (or instructions per execution) 
2-2
Throughput margin 
Processing lag 
Scheduling requirements (e.g., preemptible or nonpreemptible) 
Task execution order dependencies 
Inter-function communication requirements (bits per iteration, latency) 
Given this data it is possible to construct a distributed schedule for the task suite 
and quantitatively perform system sizing and determine performance parameters. The 
current set of requirements does not yet possess this level of detail, so several simplifying 
assumptions are made to allow a rough system sizing. First, it is assumed that all tasks are 
preemptible within their frame by higher-frequency tasks, as long as the preempted tasks 
complete their execution within that frame. This allows the throughput utilization of a 
processor to be determined by summing the throughput requirements of the tasks it hosts. 
The second assumption is that the processing lag requirement will be met if the processor 
possesses the throughput to execute the requisite number of instructions of a task iteration 
within the processing lag. It is realized that these two assumptions may be mutually 
inconsistent since a task which is preempted within its frame may not meet its processing 
lag requirement. However, reconciliation of this potential inconsistency must be deferred 
until the detailed requirements are available. 
CSDL has transferred the available numbers from the Hypercard documents to 
spreadsheets to facilitate their manipulation. The spreadsheets are presented as Tables 2-1 
to 2-9 and are interpreted as follows. 
Column A: Name and Number of Task(NOte 1) 
Column B: Frame Rate F(N0 1) 
Column C: Throughput TO(Note 1) 
Column D: Margin M(NO 1) 
Column E: Processing Lag L(NO 1) 
Column F: Margined Throughput TM = TO( 1 +M)(NOte 2) 
Column F Row 4: Total Margined Throughput Required = m 'i(Noe 3) 
Column G: Instructions per Execution IE = TM/F 
Column H: Instantaneous Throughput IT = IE/L. 
Column H Row 4: 
Maximum Instantaneous Throughput Required = max(IT). 
Each task is identified by a name and a hierarchical number (Column A). The number 
indicates the place of the task or function in the three-level hierarchy of functional 
2-3
requirements. For example, Kalman Filter (1.2.4) is Task 4 of the NavIIMU (1.2) 
function which is sub-function 2 of the Central Control & Processing (1) function. 
The Frame Rate (Column B) is the iteration rate in Hertz of the task. For example, 
Kalman Filter task must execute at 25 Hz. 
The Throughput TO (Column C) is the throughput, measured in instructions per 
second, required to perform a task. For example, Kalman Filter task is estimated to require 
3.2 million instructions per second throughput. 
The margin M (Column D) is the additional throughput requirement for a task. The 
margine is provided for the uncertainty in estimating the throughput or for growth 
purposes. The margin for Kalman Filter task, for example, is 0.5. The total throughput 
required for this task is TO(1+M) or 4.8 MIPS. This is called the Margined Throughput 
TM and is shown in Column F. The sum of all the margined throughputs for the tasks 
under a function is shown for each function in Column F, Row 4 of each table. For 
example, the total throughput for Central Control and Processing function is 21.8 MIPS, as 
shown in Table 2-1. 
The maximum allowable processing lag for each task is shown in Column E. This is 
the interval from the time a task needs its inputs to the time it produces its outputs. The 
maximum allowable processing lag for Kalman Filter task, for example, is 36 milliseconds. 
The number of instructions executed by a task per iteration is shown in Column G and 
is obtained by dividing the margined throughput TM by the frame rate F. For example, 
Kalman Filter task executes 210,000 instructions every iteration. The instantaneous 
throughput IT is the processing throughput required to execute a task within the requisite 
processing lag and is shown in Column H. For Kalman Filter task, this number is 5.83 
MIPS. 
Column H, Row 4 of every table indicates the maximum instantaneous throughput 
required for that function. For example, the Central Control and Processing function 
requires 10.72 MIPS to be able to perform the system identification task (1.4.3) within the 
required 10 msec processing lag. This number represents a lower bound on the throughput 
required of the processor 
Notes: 
1. From Martin Marietta ALS Specs. 
2. Changed from TM = TO/(1-M) to prevent blowup at M=1. 
3. Assuming preemptive scheduling. 
Also, note that boldface italicized numbers represent CSDL guesses. Italicized 
numbers represent Martin Marietta-supplied aggregates and are not used in calculations 
unless otherwise noted.
2-4
A I	 B	 I	 C	 I D E F G H 
I I	 Frame I	 Thruput I Mar g in Proc. Margined ins/Exec Instant 
2 Rate	 (Hz)I	 (IPS)	 I Lag (ms) Throughput  'hruput	 (IPS 
3 Central Control and Processing I  Required
 Required 
4 Exec	 171	 I  sum: 21 .829.396 max: 10.722.625 
5 Central	 Control	 1.1.1  I	 1001	 30,000 0.5 8 45.000 450 56.250 ........ 
7 Timing & Sequence 1.1.2  
i	 i aol	 30,000 0.2 0.8 36.000 360 450.000 
9 1PS	 Self-Test	 1.1.3  7 0 1	 11	 40,0001 1 200 80,000 80,000 400,000 
1 T IPS Fault Detect & Mgmt 1.1.4  
251	 50,000 1 10 100,000 4,000 400,000 
Subsystem	 Status	 Monitor	 1.1.5 ____________ 
251	 38,000	 0.5 
_1 10 57,000 2,280 228,000 Mission Phase Sequence Gen 1.1.6 _ 
R17
_1 
 I	 1001	 5,000 0.2 5 6.000 60 12.000 
Data	 Reporting	 1.1.7
50 20,000 1 1 40,000 800 800,000 
Nav/IMU	 1.2
IMU	 Processing	 1.2.1  
.2,.1. I	 1001	 3.200.000 0.2 10 3.840.000 38.400 3.840.000 22 GPS	 Processing 1.2.2  I	 1001	 3,200,000 0.2 10 3.840.000 38.400 3.840,000 
Error	 Com pensation	 1.2. 
I	 3001	 25.000 0.5 1 37.500 125 125.000 
1
(alman	 FIltering	 1.2.4  
51 I	 2	 3.500.000 0.5 36 5.250.000 210.000 5.833.333 
Nay Exec 1.2.5  I	 100 1.000 1 6 2.000 20 3,333 
30 FDI	 1.2.6 
T 1
	
50 132,000 =0
 
 0.5 198.000 3.960 7,920,000 TT Bending	 Processing	 1.2.7 ___ _____ __________ _________ TT 1001	 1.200 
_j -----6-2 . 0.1 1,440 14 144,0016  34 Atve Gdnce 1.3  TV Two-body Linear Guidance 1.3.1  T j	 501	 37,500(0.2 15 45.000 900 60,000 3 7 Ji-Linear	 Traj	 Shaping	 1.3.2TV I	 ii	 3,300,000 0.2 960 3,960,000 3,960,000 _ 4,125,000 TV Contingency	 Control	 1.33  TT I	 I	 1	 1,000 1 1 2,000 2,000 2,000,000 TT Adaptive	 Control	 1.4  
42 Model	 Reference Adaptive Controlle r	 1.4.1
TY I	 501	 505.100 0.25 10 631.375 12.628 ________ 1.262.750 
44 Classical	 Autopilot	 1.4.2 __ 
 I	 501	 542.350 0.5 10 525 16.271 1.627.050 
46 System	 identification	 1.4.3 EE __ 47 I	 251	 2.144.5251 	 0.25 10 2,680.656 107.226 10.722.625 
. 48 System RM 1.5 (includes CSDL RM task thr uqh puts ap roximated as 50KIPS/3) 
49 Fault	 Response	 1.5.1 
_0 I	 2 51	 17.000	 1 40 34.000 1.360
_ 
34.000 
5 1 Fault	 Determine	 &	 Isolate	 1.5.2
j	 251	 17.000( 1 40 34,000
_________  
1,360 34.000 
Configuration	 Manager	 1. .3  
j	 251	 17,000 1 40 34,000
_________  
1,360 34.000 
1
GenDlscretes 1.6  
Validate Commands 1.6.1
_________  
________ 
I	 501	 4,200 1 2 8,400 168
__ 
84,000 
5 8 Update Outputs	 1.6.2 ________
VT I	 0I	 4,000 1 2 80.00
TV
8,000  
Discrete	 Error	 Handling	 1
,0001
.6.3 _______
i I	 1	 I	 1 1 0.5 4.000,062
EE
Det G&C Wnd Dltas 1.7 (no task t roug h puts available)
63 I	 I	 I	 29.000 I0.5 20 43.500 
64 Load	 Relief	 Al gorithm	 1.7.1 ________ __________  
i I	 1001	 01	 0 10 0 i 01 0 6 6 Estimate	 Fluctuation	 Stats	 1.7.2167: I	 1001	 01	 0 10 0 0 __ 0
Table 2-1. Central Control & Processing 
A B C D E F G H 
1 Frame Thruput Mar g in Proc. Margined Ins/Exec Instant 
2 Rate	 (Hz) I	 (IPS) _Lag (ms) Throughput  hruput	 (IPS) 
3
- 
Winds Ahead Determination 2  Required  Required 
4 _______ sum: 670.800 max: 656,842 
5 1 1	 520,000 1 0.5 500  
6
- 
Manage Measurement Resources 2.1  
I	 ii	 1001 i 1 
8 Lidar Cal and Checkout 2.1.1  1 1	 ii	 0 1 1 0 0 0 
'1 i	 ar	 BITE	 2.1.2  
i I	 251	 6,000 0.5 1 9,000 360 360,000 
2 Lidar	 Fault	 Handling	 2.1.3 _______  
3 I	 251	 0 0 1 0 0 0 
4 Lidar	 Health	 Monitoring	 2. .4  
.1 .i I	 501	 3.000 0.5 1 4.500 90 90.000 
1 6 Compute Wind	 Profile 2.2  ill I 
1 8 Lidar mode	 Control 2.2.1  
1 1 I	 ii	 0 1 1 0 0 0 
2 0 Winds	 Measurement	 Fitting	 2.2.2  
2.i. I	 ii	 520,0001	 0.2 950 624.000 624.0001 656.842 
2 2 Vibration	 Compensation	 2.2 .3 _ 
i I	 501	 5,0001 0.5 1 7,500 150 150,000 
2 4 Bending Compensation 2.2.4
 I	 501	 10,000 1 1	 5 20,000 400 80,000 
26 Control	 Velocimeter	 2.3  
IL I	 100 I	 2.000 1 1 4.000  
2 8 Redundant Lidar Configuration 	 Contr )1	 2.3.1  
i9 I	 1001	 0 0 1 0 0 0 
3 0. Lldat	 Power	 Control	 2.3.2  
j_ I	 251	 0 0 40 0 .0 0 
32 Receive and Process Winds Info 2.4  I	 501	 800 1 1 
3 4 Detection	 2.4.1  
I!. J	 ii	 0 1 1 0 0 0 
36 Pulse	 Deconvotution	 2.4.2  
._z. I	 ii	 0 1 1 0 0 0 
3 8 Range	 Determination	 2.4.3  
ii I	 ii	 0 1 1 0 0 0 
40 Doppler	 Frequency	 Estimation	 2.4.4  
T J	 ii	 01	 1 1 0 0 0 
T2 Data Collection	 &	 Formatting  
.
4
_2. I	 .	 11	 6001	 1 10 1,200 1,200 120,000 
44 Check Range W/ Exp Values 2.4.6 _______  3 5 I	 ii	 5001	 0.2 51 600 600 120,000
Table 2-2. Winds Ahead Determination 
B C  F II 
1 Frame Thruput Margin Proc. Margined Ins/Exec instant 
2 Rate (Hz) (IPS) _________ Lag (ma) Throughput  Thruput (IPS) 
7 eh	 e Power System Manage Vid	 : ient 3  Required
 Required 
 sum: 6.900 max: 138,000 
5 - 25	 4.600 0.5 2 6.900 276 138,000 
6 Battery Charge Mgmt 3.1 ___________  
z _i 1 
8 Charge Mgmt Exec 3.1.1 
1 1 1 
troiCharg in 	 3.1.2  
1	 0 1 1 0 0 0 
12 Monitor Charge 3.1.3  
I_ii_0 1 1 0 0 0 
Errpr_DetectionStatus_Deteri _ Inatlon_2 
I _0 1 1 0 0 0 
16 Pwr System Load Error Det3.2. 
17 I_ii_0 1 1 0 0
_ 
0 
PwrEmergencyMgr 3.2.2
 
I_11_0 1 1 0 0 C) 
ZU I Load Shedding  
213.31 10 1 1 0 0 0 
IC L080 Distribution 3.4  
I_11_0 1 1 0 0 C" 
24 PwrSystemLoadControl 3.4.1  
25 Iii 0 1 1 0 0 0 
2
-
mission
	 3.4.2
1 11	 0 1 1 0 0
_ 
8 ChangeLoadDistribution3.4.3 ________ _________  
zu I_ii 1 1 0 0 0 
30 Power-Up Initialization3.4.4 
_ j__ i_0 1 i	 1 i	 0 0 0 
.3... MopitorLoad &_Predict_Deøieh' _	 _ 3-45  33
_._ I_11_0 11 11 0 0 0
Table 2-3. Vehicle Power System Management 
2-7 
Al B  F G	 I H 
1 Frame Thruput Marg in Proc. Margined Ins/Exec	 I Instant 
2 Rate (Hz) (1PS) Lag (ms) Throughput  Thruput (IPS) 
Required  Required 
T
___________  
Steering & Staging Control 4 sum: 103.502 max:. 1,800,000 
5	 1 I	 501	 55,000 0.5 15 82.500 1,650 110,000 
6	 1 Provide Steering Signals 41  1 I	 ii	 0 0 1 0 0 0 
8 Identify Changed Cards 4.1.1  
-- I	 ii	 Ul 0 1 0 0 0 
TO Corn Cmd to Prev & Limit Rate 4.1.2  
n ii	 01  1  
12 1 Gen New Steering Signal 4L1.3  
13: I	 1,	 0 0 1 0 0 0 
14 Report Steering Cmd Erro 4.1.4  
11 I	 ii	 0 0 1 0 0. 0 
iT Validate Steering Cmd Retonse 4.1 5 ___________  
11 I	 ii	 0 0 1 0 0 0 
18 Select Outputs and Verify mds 4.2  _________  
'1 J	 50 1 	 5,000 1 4 10,000 200 50,000 
'20 ' Validate Dlscretes 4.2.1  
—ffi: I	 0.41	 1,000 1 4 2,000 5,000 1,250,000 
22 Activate Staging 4.3  
23:: I	 ii	 0 0 1 0 0 0 
24 LFU Control & Monitor 4.3. 1	 ii	 0 0 1 0 0 0 
Verify & Sequence Power .3.2  
27 1	 I	 11 1 0 1 1 1 1.000 
28 1 Ordnance BIT 4.3.3 
29 1	 I	 ii	 6.000 0.5 51 9.000 9.000 1,800,000 
30 1 Fault handling 4.3.4  
11 1	 I	 ii	 i  i  1 1,000 
32 1 LFU Redundancy Control 4.3.5  
.- 33 1	 I	 ii	 01 0 1• 0 0 0 
4 ActivateDiscreteDevices .4  
35:: I_1, 0 0 1 0 0 0 
36 1 ProvideDiscrete Outputs 4.1  
7 I_11_0 0 1 0 0 0 
5 RCS Control4.5  1 1 0 0 1 0  0 
40 1 RCSError Generate 4.5.1  
_i_ I_• 1I_0 0 1  0 0 
42 PhasePlaneControl4.5.2  
_43 1	 1	 fl	 01 0 1 0 01 0 
44 RCSCommandGenerator 4.5.3 _______  
35 1 I_11_01_0 1 0 0 0
Table 2-4. Steering & Staging Control 
2-8 
-	 B	 C.	 DI	 E H 
1	 -	 Frame	 Thruput	 Marvin I	 Proc.	 Mar9ined	 ns/Exec	 Instant 
• T•	 Rate (Hz)	 iPS)	 Ea ag (msthroughput 	 hruput (IPS) T	 Sensor Processi g 5	 ___ 	 Required i 	 1	 _________	 SUM:	 2640.651	 Max:	 2,030,000 5
	
25	 636.000	 0.5	 34  
6	 1 Prescale & Catec' rize Sensor Data 5.1  1	 1	 25	 12.500	 0.5	 34  
8	 Identify Format 5	 .1  
:	 -1 	 50	 130.000	 0.5	 40	 195000	 3900	 97.500 
10	 Filter & Store 5.1.  
11 	 J	 50 	 812,000	 1	 16	 1624000	 32,480	 2,030,000 
12 1 Format-A Convei	 & Store 5.1.  
15	 J	 25	 5,000 	 40	 10000	 400	 10,000 T Format-B Convei & Store 5.1.  
:15 1F	 1 	 1	 1  
16 IType Processing 5.2  
'17 1	 1
	
50	 90,000	 1	 5	 180000	 3,600	 720,000 
18 	 Compare With Lh Ats 5.2.1  
1	 0 	 1 	 0  
2O	 Averag ing & Vot iq 5.2.2 
T	 1	 1 ____ __	 1    Count Limits Errors 5.2.3  
1______ _______  
1  	 1  
4•Count Limits Errors 5.2.4  
::	 1	 •0 	 1	 0  
26 JSeftt Redunda, y 5.2.5  
27	 1	 1	 0 	 1 	 0  
28
	
Cailkmtlan A Ya dation5.3  
29	 25	 112,500	 0.5	 5	 168750&750	 1,350,000 
Q.	 Determine Ciltic 'ity 5i.1  	 --  
31	 1	 1	 .0	 .0	 1	 0	 0	 0 
32	 Manage Recover 5.3.2  
4	 Sensor Fault Toli rance 5.4.0  
3'	 1 . 75	 308,500	 0.5	 5	 462750	 6j70	 1,234,000 
36 1 Limit Checks 5.4. _________ ______ _______ __________  
-.7 	 1	 0	 0	 1	 0	 0	 0 
W Consistency Ch 	 ks 5.4.2	 _ _______  
391	
1 
4' Validation Error	 eport 5.4.2  _____	
0 	 1    
41
	
1 1 	 i 1  
42 1 Error & Failure R portIng 5.5 1 	 _______  
43 1	 j	 1 1	 100 1	 0.5 1	 51	 150 1	 150 1	 30,000
Table 2-5. Sensor Processing 
2-9 
B C D E F G	 I H 
1 Frame Thruput Margin I Proc. Margined Ins/Exec instant 
2
-
Rate (Hz) (IPS) Laa (ms) Throughput 1hruput (IPS) 
3
- 
Propulsion Control 6  Required  
-
1
SUM: 20,000,000 Max: 26,666,667 
5 - 50 110,000,000 1 15 20,000,000 400,000 26,666,667 
Manage Propulsion and Fault Tolerance 6.1  
_.z._ I	 11	 0	 0 1 0 0 8	 1 Prop Cntrlir Timing & Exec 6.1.1  
. 9 .. I	 1 	 0 0 1 0 0 o 
10 Prop Memory Mgmt 6.1.2  
•11 I	 11	 0 0 1 0 0 0 
12 Prop Fault Mgmt 6.1.3 
:13 i	 11	 0 0 1	 1 0 0 0 
14
	 1 Update Data Tables & Report 6.2 
5 I	 11 	 0 0 . 
16 1 Reoort En gine Status 6.2.1  
17 I	 11	 0 0 1 0 0 0 
1 ' Update Table 6.2.2  
19 1	 11	 0 0 1	 1 0 0 0 
Process Commands 6.3  
:i: I	 11	 0 0 1  0 0 
22 Monitor and Verify Commands 6.3.1  
23:: I	 ii	 0 0 1 0 0 0 
24 Compute Valve Position 6.4  
25 I	 11	 0  1 0 0 0 
26 1 Engine Control Loops 6.4.1  
I	 11	 0 0 1 0 0 0 
28 Update Actuator Positions 6.4.2 ______ _______ ___________ _________  
I	 11	 0	 0 1 0 0 0 ..,
30 Compare Actuator Model 6.4.3  
•'31 I	 11	 0	 0 1 0 0 0 T 1 Process Propulsion Sensors 6.5  
33 1	 I	 11	 0 0 1	 1 0 0 0 
T 1 Sensor Data Scaling 6.5.1  
11	 0 0 1 0 0 0 
r 1 Redun Sensor Process and Ouai 6.5.2  
az_ 1	 I 11	 01	 0 1	 1 0 0 0 
38 Limit Monitor & Failure Detect 6.5.3  
39:: I	 1 	 0 	 0 1 1	 0 0 0 
40 Sensor Failure Handler 6.5.4 
41: I	 11	 01	 0 1 1	 Ol 0 1	 0
Table 2-6. Propulsion Control 
2-10 
B C D E F	 .1	 G I	 H 
1 Frame Thruput Margin I	 Proc. MarginedI ns/Exec I	 instant 
2
-
Rate (Hz) (IPS) Lag (ms) rhrouqhput 	Thruput (IPS) 
3
-
 Required  
4
_ - 
Command & TLMProcessing 7 SUM: 525.000 max: 525.000 
5 I	 25 I 350,000	 I	 0.5 40 525,000 21,000 525,000 Decode & Process Command Data 7.1  
I	 11	 01	 0 1 0 0 0 T Decode & Verify Commands 7.1.1  1 
	
Cl	 0 1 0 0 0 
1 5 Validate Ground Data 7.1.2  
:11 ._J 11	 0	 0 1 0 0 0 ir 1 Command Acceptance 7.1.3  
:13 1	 i	 1 	 0 0 1 0 0 0 
14 TLM Table Manager 7.2  
:15 t	 ii	 0 0 1 0 0 0 
16 Format Telemetry 7.2.1  
ji_ 1	 I	 1 	 0 0 1 0 0 0 18 TLM Transit 7.2.2  
:i: i	 11	 0 0 1 0 0 0 20 Control Telem	 Format 7.3.1  etry 
21:: I	 11	 0 0 1	 1 0 0 0 
Table 2-7. Command & Telemetry Processing 
- A 
-
 U _____ (i II 
1 Frame Thruput MargIn III Margined Ins/Exec
Iruput 
Instant 
• T Rate (Hz) (IP) Lag (ms) Thruput (IPS) 
____________ ______ Required  4 Range Safety & Destruct 8  SUM: 5,400 max: 1,800,000 
T
1
	 1	 3,600 0.5 3 5400 51,400 1 800 000 
B _R1êcèTvè& DecódeSl9nafS.1.t---------------------------------------
 1 
0 0 0 
7111 Self-Test & timing 3.1 .2 
-- - 11 --a
T2 
W
transmit C-Band Signal 8.1.3 
-------------------1--- 
Comn ndReceive	 becode8.  
 
_ _____ 
iT 0 0 1 0 a l. Oestructtàmmand Logicø..t IT
.1.—-.-
ii	 0 
SiilikfPoSöbràe822
1	 0
0 
-O
1 
1
0 
-	 .
0 
F ie Destruct
	
8.23 -
21 11 .0 1 0
-. 0 27 
24
Destruct System Control .3 
.1	 0
 
Safe & Arm ISDS 8.3.1  
....
. .
2E ............ Verity Loop Integrity 8.3.2
0 
r 
29
ISDS Redundancy Manager8.3.3 
1	 I	 1	 1
_______  
ô Initiate Destruct 8.3.4  
Table 2-8. Range Safety & Destruct 
2-11 
- A B G 0 E F H 
Frame Thruput Margin Proc. Margined ins/Exec I	 instant 
Rate (Hz) (IPS) Lag (ms) Thrupt Thruput (1PS) 
-r  Required  
T Programmable Payload i/F 9 . •. 
- 
j	 1j	 0 1
Table 2-9. Programmable Payload IN 
2.2.1.2 Derived Requirements 
The ALS throughput requirements discussed in the previous section were derived 
by Martin Marietta using certain assumptions that can have a substantial impact on the ALS 
avionics architecture and the number of processors and other hardware required to meet the 
performance requirements. In particular, the overheads for using Ada appeared too high in 
comparison with the CSDL experience with the new Ada compilers and Run Time 
Systems. As the Ada compilers have matured over the last few years, they have become 
much more efficient in generating code. Our recent experience indicates that newer Ada 
compilers generate code that is almost as good in code density and execution time as other 
high level language compilers such as C [22]. This appears to be true for a wide variety of 
programs including computationally intensive programs and operating system oriented 
programs. 
In order to quantify the overheads of using Ada compared to an assembly language, 
and compare this overhead with the factor of 6 assumed by MM, a typical ALS application 
algorithm, the Lateral Acceleration Sub-System filter function (LASS), was chosen as a 
benchmark. This function had already been coded by MM in Ada. It had also been coded 
by CSDL in Ada using the CASE tool. Both versions were compiled with the Verdix 5.5 
cross compiler and the XD Ada cross compiler produced by Systems Designers Software, 
Inc. and Digital Equipment Corporation. They were compiled with and without Ada 
constraint/range checking, thus producing a total of 8 versions. The Verdix 5.5 compiler is 
currently being used by CSDL on AlPS and a number of other projects. The XD Ada 
compiler is a newer, much more efficient compiler that has been undergoing Beta testing at 
CSDL for the past few months. 
The eight versions of the LASS filter were run on the AlPS Fault Tolerant 
Processor (FTP) that was built for NASA Johnson Space Center (JSC) to obtain 
comparative execution times. The object codes of the eight versions were also analyzed to 
produce instruction counts. Figures 2-2 and 2-3 summarize the number of instructions and 
the execution times for the eight versions of the LASS filter function. Ideally, the 
execution time should be compared to the execution time of LASS function coded in an 
assembly language to determine the overhead of using Ada. However, the MM-supplied 
2-12 
requirements do not contain execution times of functions. They consist of an estimate of 
the number of instructions for each function. These estimates were arrived at by examining 
the function flow charts and using the following assumptions: 
1. Time for 1 add = time for 2 instructions. 
2. Time for 1 multiply = time for 4 instructions. 
3. Time for 1 divide = time for 8 instructions. 
4. Time for high order language (Ada) overhead (addressing, pragrnas, etc.) = 
sum instructions for adds, multiplies and divides (#1-i-#2+#3) and multiply sum 
by 6. 
Since the goal of the benchmarking was to calibrate the MM-supplied requirements 
(rather than to compare Ada to an assembly language on an absolute basis), we used the 
MM assumptions to compute the number of "assembly language" instructions for the LASS 
function. The total number of instructions, using the flow charts and the assumptions 1, 2, 
and 3, was calculated to be 74. This was multiplied by 6 (assumption #4) to arrive at 444. 
This number can now be compared to the instruction counts in Figures 2-2 and 2-3. 
H/W: AlPS Engineering Model JSC FTP 68020/68881/15.7 MHz 
Compiler Num of Instr per her Time per Iter 
Verdix5.5 w checks 415 1.09 msec. 
Verdix 5.5 w/o checks 341 0.85 msec. 
XD Ada w checks 160 0.60 msec. 
XD Ada w/o checks 115 0.43 msec. 
Figure 2-2. LASS (MM Coded) Benchmark Results 
H/W: AlPS Engineering Model JSC FTP 68020/68881/15.7 MHz 
Compiler Num of Instr per Iter Time per Iter 
Verdix 5.5 w checks 277 1.22 msec. 
Verdix 5.5 w/o checks 247 1.14 msec. 
XD Ada w check 109 0.54 msec. 
XD Ada w/o checks 99 0.51 msec.
Figure 2-3. LASS (CSDL Coded) Benchmark Results 
2-13 
The following conclusions can be drawn from these benchmarks: 
1. The XI) Ada compiler consistently produces fewer instructions than the Verdix 
5.5. compiler. The ratio varies from 2.5 to 3 depending on whether or not range checks 
are turned on. 
2. The code produced by the XD Ada compiler consistently outperforms the Verdix 
produced code. The ratio approximately varies from 1.8 to 2.2. The code density and 
speed ratios between the two compilers are consistent with other benchmarks performed by 
CSDL.
3. The manually coded MM version produces more instructions than the CASE but 
generally executes faster. The CASE produced version passes many parameters with each 
iteration and the instruction used to push a parameter onto the stack apparently takes a 
longer time than the average instruction. Other reasons for this disparity have not been 
analyzed at this time. 
4. Turning the range checks on or off has a much greater impact both in the 
number of instructions as well as execution time on the manually coded version than on the 
CASE produced version. This is due to the fact that CASE uses predefined functions for 
certain floating point operations which are not affected by turning the checks on or off. 
Given the superior performance of the XI) Ada compiler which represents state of 
the art in Ada compiler technology, it will be assumed here that the ALS will either utilize 
this compiler or an advanced Ada compiler of similar performance. The following 
conclusions are based on this assumption. 
5. XI) Ada compiler generates 99 to 160 instructions for the LASS filter compared 
to 74 counted in the flowchart, resulting in an overhead factor of 1.34 to 2.16. (As pointed 
out earlier, the number of instructions would be 444 using the MM overhead factor of 6 
which would have been fairly representative of the earlier Ada compilers.) The ALS 
throughput requirements have been recomputed in the next section using an Ada overhead 
factor of 2.2 as mutually agreed to by MM and CSDL. 
6. Martin Marietta used a further multiplicative factor of 2.3 to account for the 
"logical and environmental overhead" of scheduling and dispatching a task. This overhead 
is actually not dependent on the number of instructions in a given task. Instead, the 
overhead is a fixed number of instructions required for each iteration of the task, i.e. each 
context switch. The number of instructions for scheduling and dispatching a periodic task 
was measured using the Verdix 5.5 compiler and run time system (RTS). A periodic task 
dispatch takes 820 microseconds and was equal to 830 instructions. The instructions used 
for the context switch take less time (less than 1 microsecond per instruction) than typical 
computation intensive instructions. These numbers are currently not available for the XI) 
2-14
Ada compiler. However, based on the code produced by the two compilers and the fact 
that the XD Ada run time system is written in an assembly language while the Verdix run 
time system is written in Ada, our estimate is that approximately 500 extra instructions 
would be required for each iteration of each task. MM feels that there is still some 
overhead that is proportional to task size and a proportional factor of 1.5 in combination 
with a fixed overhead of 250 instructions per task per iteration should be used. The ALS 
requirements have been recomputed in the next section using these two factors. 
A new set of ALS throughput requirements have been derived using the findings 
reported in the previous section. Using an Ada overhead of 2.2 rather than 6, and a fixed 
scheduling overhead of 250 instructions per task per iteration in combination with a 
proportional overhead of 1.5 rather than a factor of 2.3, the new ALS requirements are 
summarized in Tables 2-10 to 2-18. The following conclusions can be drawn from the 
modified requirements. 
The reduced overheads of an advanced Ada compiler make the modified throughput 
requirements obviously much smaller than the initial requirements provided by MM. 
However, the effect of changing the scheduling overhead from one that is proportional to 
the task size to one that is dependent on the task iteration rate is not so obvious. The 
dispatch overhead for high frequency tasks is quite high: for example 25,000 instructions 
per second for a 100 Hz task. On the other hand, the requirements for large tasks are 
substantially reduced: for example, the Kalman filter function requirement (including 
margins) is reduced from 3.5 MIPS to .84 MIPS. 
The net effect of these changes is that the overall ALS throughput requirements 
(including margins) for non-propulsion functions are reduced from about 26 MIPS to about 
8.8 MIPS. Furthermore, the maximum instantaneous throughput (throughput required to 
perform the most demanding indivisible task) for non-propulsion functions is reduced from 
about 10.7 MIPS to about 3 MIPS. For propulsion control functions, the total throughput 
requirement reduces from 20 MIPS to about 4.8 MIPS and the maximum instantaneous 
throughput requirement reduces from 26.7 MIPS to about 6.4 MIPS. 
The derived requirements were used to synthesize the ALS avionics architecture as 
discussed in Section 4. 
2.2.2 I/O and Interfunction Communication Requirements 
To obtain estimates of the Input/Output and Interfunction communication require-
ments, the ALS requirements Hypercard stacks provided by Martin Marietta [3, 4] were 
examined. The communication requirements contained in these documents are less 
complete than the computational requirements discussed above. For functions for which 
relevant communication requirements were absent from the Hypercard stacks but which 
were known to possess significant communications needs, other preliminary requirements 
documents were consulted [5, 61.
2-15
A B	 I	 C	 (_D El F G	 I H 
1 Frame IMM Thruputlod Thrupu iarg id Proc. Marg ined I Ins/Exec I	 Instant 
2 Rate	 (Hz A 	 (UPS)	 ( (UPS) Lag	 (msThrounhout Thruput	 OP.' 3 Central Control and Processin g 1  Required  Required 
4 Exec	 1.1	 I  sum: 5,841.958 max: 3,036,730 
5 Central	 Control	 1.1.1 _____  
T j	 1001	 30,000
___ 
32,174 0.5 8 48,261
__________ 
483
__________ 
60.326 
7 Timing & Sequence 1.1.2 ___T 100	 30,000 32,174 0.2 0.8 38,609 386 _______ 482,609 
9_ UPS	 Self-Test	 1.1.3  
0 ii	 40,0001 9.815 1 200 19.630 19.630 98.152 
1 UPS Fault Detect & 1.1.4  MOO I	 251	 50.000 18.207 1 10 36.413 1.457 145,652 
3 ISubsystern Status	 Monitor	 1.1.5  
T 1 I	 251	 38,000r- 15,337 0.5 10 23,005 920 92,022 V IMIssion Phase Sequence Gen 1.1.6 ___ 
iT 1001
	
5,000 26,196 0.2'----T 31,435 314 62,870 ¶7' Data	 Reporting	 1.1.7  
1
 I	 501 20,000 17,283 1 1 34,565 691 691.304 
1 9 Nav/1MU	 1.2  
20 IMU	 Processing	 1.2.1  
21 I	 1001	 3.200.000 790.217 0.2 10, 948.261 9.483 948.261 
22 IGPS Processin g 1.2.2  
23 1
	
1001	 3,200.000 790,217 0.2 10 948,261 9.483 948,261 
24 error	 Compensation	 1.2.3 ____ __________ 
T 1
	
3001	 25,000 80,978 0.5 1 121,467 405 404,891 
26 Kalman	 Filtering	 1.2.4 __________ _____ ______  
I	 251	 3.500.000 843,207 0.5 36 1,264,810 50,592 1,405,344 
28 Nay Exec 1.2.5  
29 I	 1001 1,000 25,239 1 6 50,478 505 84,130 
30 FDI	 1.2.6  
1 I 	 501 132.000 44,065 0.5 0.5 66.098 1.322 2,643.913 
_ 32 Bending	 Processing	 1.2.7 
T j	 1001	 1,2801 0.2 0.1 30,367 304 3,036,73 T Aptve Gdnce 1.3  
T 'fW-body	 Linear Guidance	 1.3.1 __ ____ _____  I I	 501	 37,500J	 21,467 0.2 15 25,761 515 34,348 37 Non-Linear	 Trai	 Shaping	 1.3.2  j I	 11	 3,300,000 789.380 0.2 960 947.257 947,257 986.726 
39 Contingency	 Control	 1.3.3  
j,9 I	 1	 I	 1.000 489 1 1 978 978 978.261 4 1 AdaptiveControl	 1.4 _____  
TT Model	 Reference Adaptive Controller .4.1  
WT j	 501	 505,100 133,285 0.25 10 166.606 3,332 333,212 TT Classical	 Autopilot	 1.4.2 
TV I	 501	 542,350 142,192 0.5 10 213,2891 4,266 426,577 TV System	 identification	 1.4.3 
LL I	 251	 2.144.5251	 519.071 0.251 10 648,8391 25,9541 2.595.356 4 8 lSvstem RM 1.5 (includes CSDL RM; task throu g hputs approximated as 5OKIPS/3) 
49 Fault	 Response 1.5.1  
so I	 251	 17.000 10,315 1 40 20.630 825 20.630 
51 Fault	 Determine	 &	 isolate	 1.5.2 _________ __ 
VT 1
	
251	 17,0001 10,315
_____  
1 40 20.630 825
______ 
20.630 
53 Configuration	 Manager	 1. .3 _____  
VT I	 251	 17,000 10,315 1 40 20,630 825 20,630 
VT Gen Discretes 1.6  
56 Validate Commands 1.6.1  
I	 501	 4.200 13,504 1 2 27,009 540 270,087 
58 'Jpdate Outputs	 1.6.2  
, 9. I	 sot	 4.000 13.457 1 2 26,913 538 269,130 
60 Discrete	 Error	 Handling	 1.6.3  
VT j	 II	 1,0001	 489 1 0.5 978 978 1,956,522 VT Oct G&C Wnd vitas 1.7 (no task throughputs avaiiable;  
VT I	 1 I
_
	
29.0001	 7,185 0.5 20 10,777 10,777  
VT Load	 Relief	 lgorithm	 1.7.1 
I	 1001	 01	 25,000 0 10 25,000 250 25.000 
66 Estimate	 Fluctuation	 Stats	 1.7.2  
67
.
I	 iool	 01	 25,000 0 101 25.0001 2501 25,000
Table 2-10. Central Control & Processing (Modified) 
2-16 
I B	 I	 C D E F C H 
Frame	 1M Thru put Mod Thrupu Mar g in Proc. Margined ins/Exec instant 
2
-
Rate	 (Hz)I(IPS) (UPS) .ag	 (ms Throughput  Thruput	 (IPS 
) W1?ds Ahead Determination 2
______
 
RequIred
 Required 
 sum' 348.823 max 507.652 
_5_ 1 520,000 1 124,598 0.5 500  
6	 IMenace, Measurement Resources 2.1  
. 100 274 1 1 
8 CTr Cal and Checkout 2.1.1  
9 I	 1 I	 0 250 1 1 500 5001 500.000 
0 Lidar	 BITE	 2.1.2  
1 1	 251	 6,000 7,685 0.5 1 11,527 461 461,087 
T Lider	 Fault	 Handling	 2.1.3  I I	 251	 0 6,250 1	 0 1 6.250 250 250.000 
4 Lidar	 Health	 Monitoring	 2- .4  
iT I	 501	 3.000 13,217 0.5 1 19,826 397 396,522 
compute Wind Profile 2.2  
¶ 7 I	  
I 8 IJdar mode Control 2.2.1
I I	 I I	 0 250 1 1 500 500 500.000 
2 Winds	 Measurement Fittlna	 2.2.2  
I	 I	 520,000	 124,598 0.2 950 149,517 149,517 157,387 
2 2 
L
Vibration	 Compensation	 2.2.3  
Z I	 501	 5,000 13,696 0.5 1 20.543 411 410.870 
2 4 Bending Compensation 2.2.  
T I	 501	 10,000 14,891 1 5 29,783 596 119,130 T iiitrol	 Velocimeter	 2.3  
Li. I	 100 I	 2,000 25,478 1 1 50.957 510  
28 Redundant Lidar Confi guration Control 3.1  
29 I	 1001	 0 25,000 0 10 25.000 250 25.000 30 1-10t	 Power	 Control	 2.3.2  
I	 251	 01 6.250 0 40 6,250 250 6.250 
32 Receive and Process Winds info 2.4  
33 j	 53 1
	
800 12,691 1 1 25,383 508 507,652 
T4 biiiction	 2.4.1  
! I	 I	 0 -250 1 1 500 500 500,000 
36 Pulse	 Deconvolutlon	 2.4.2 - T' j	 ii	 0 250 1 1 500 500 500,000 TT Iiq.	 Determination	 2.4.3 - 
TT 1	 1 1	 Ui 250 1 1 500 500 500,000 
40 Doppler	 Frequency
	
Estimation 2.4.4  jj_ I	 i	 01	 250 1 1 500 500 500.000 
42 Data	 Collection &	 Formatting 2.4.5 __  
i I	 1 I	 6001	 393 1 10 787 787 78,696 
1 44 Check Range WI Exp Values 2.4.6 _______ 
1 45 I	 11	 5001 0.2 51 01 0 0
Table 2-11. Winds Ahead Determination (Modified) 
2-17 
-r 
•7
B 
Frame 
ate (Hz)
C	 I	 D 
MM Thru	 Mod Thru 
lPS)	 p	 (IPS)
-MANIC in
F 
Proc. 
Lag	 s) (m
____ Margined 
Throughpu
H	 '	 I 
lflS/Xec I	 Instant 
 Thruput (IPS) 
Vehicle Power	 ystem management  Required  Required 
'
sum: 18.025 max: 500.000 
.1. I	 25 I	 t600 7.350 0.5 2 11.025 441 220.500 
6 Battery Charge Mgmt 3.1 
I	 ii 250 1 500 500 500000 1 Charge Mgmt Exec 3.1.1  
W I	 ii 250 1 500 500 500000 1 Control Charging 3.1.2 __________  
IT I	 ii 250 1 500 500  500.000 
T Monitor Charge 3.1.3  
1 I	 11 250  1 500 500 500.000 
IT Error Detection & Status Determinati ii 3.2  3 I	 11	 01	 250  1 500 500 500 000  
16 Pwr System Load Error Det 3.2.1  
IT
11	 CI	 250  1 500 500 500.000  1W _______  mergency Mg r 3.2.2 
1W ii	 C1 250 1 500 500 500.000 _______  
 SHedding 3.3 
ii 0 250  1 500 500 500,000 
TF mic Load Distribution 3.4
End
23 1	 ii	 01	 250 _____ 1  500 500 500.000 
4T
_______ ___
System Load Control 3.4.1
2ii CI	 250 1 500 500
26
_______ ___ 
ion Power Profile 3.4.2
7ii CI	 250 1 'I 500 500 500.000 
2W ge Load Distribution 3.4.3  
.W I	 ii
	
01 250 1 1 500 500 500,000 
'alT Power-Up initialization 3.4.4  
T 1	 I	 ii	 Cl	 250 1 i	 1 i	 500 500 500.000 
32 1 Monitor Load & Predict Deplete 3.4.5 
331 I	 11	 01	 250 11 11 5001 5001 500.000
Table 2-12. Vehicle Power System Management (Modified) 
2-18 
-	 • -	 -	 ______ 
L!L H .'nmcI rinni WI W:1r1(iEi IIJ
	 IIi	 [I	 Ji!I1 	 WON 
	
 
kLII	 LKFAJ	 _1Icr 
. I zrrirrrijrmnrt 
I	 I	 I	 J.ii	 .XoI.I 
'Z'Ht [4 tTrT 
-	
I	 TI	 I	 I1 1.KiI.I

R.F C. T.rIrIJ7Tl iinm.nriwu 
an mI	 IMEEMMMO
mmma. 
Ku 0
I	 S	 .T1'I	 I	 1.1I41,J,I,I
ti_I k'I1I-irrTI.TI; 
I somompad 	 -._I_ .TI_.1I14III 
100191 uld 
IFt1T1fl'1-i lii ILThJEZJ 
Ill •	 I	 r!I1	 EckJ	 1	 1	 g7i_ f1 fc*i.y 
t!I I'n it: rii 
IJ	 S :1c1I	 I MEMER7.1 morr.r. I 	 kLik 
W'4 V([fl(1IJj1jLc 
I	 S	 4]	 Si	 I	 ________ 
LII! a1I.r1Vz1Tht:7 
wiI •	 S	 1I	 SI	 I	 kfil	 kf i1.iIiIi 
.EI I!Li'7I4. I[i . [.i:I .l''LL 
tia -	 I	 Si J [.J7iaiir.ii:I k 9 E,  
tJ	 1!.1	 - 
1!J I iiiiIU1tc — 
II -	 I	 I•i	 SI	 I	 ___________
tJ I lall tfTITii iI. L'Thr7E !c!. 
_I	 I	 S	 Si	 I	 1.101	 1.T'1isr.i r!n 1:1 'HTfl11 'ri'irz 
L	 I	 S	 f.J	 SI	 I MOMMU411A SOMMIA ON1911MILIA 
wii I rirri	 i.i'irpw 
YA -	 I	 S	 M!iII	 I_1i1_1.191	 j'I.I 
jJ I;[*1s1l7E!. 
ci •	 •	 1.1iJ	 I	 1 i;t'±i Rii.rLeTi IITt1(EL ! LWI 
Lki•	 I	 I	 II	 I ff&A Lk iTTT1 11iHflL 
LiU	 I	 S1.1SI	 '1	 I 
	
'I	 •1	 i
• j I ;flrtiJi c"-IMIIth,IYFr I flj	 flTiI flii•I.FAflI . 111111.11 ii 
j— tJ— ____ 
i IIjIJJ'I	 .1iii! L-
U— •- i.1ci4JPcr.T1 
S.L!J I II III I	 •il . Ill	 I. ii.fljflJ 1. tq!II 
IL.I I I I 
[4h'I	 .T. -1rTL1' 
—
___ 
._II--_I	 £1 •1	 I 
Li!1 
LkI J__UlIuIUU 
1r. M's iIlrrI II.,. 
mm 
W&NRK-7M, ii I	 ii. I-
*iiu.uuiij__ I_1_jI. 
LJ J1 OEM
 
4I—	 4I I_IUIU!1 Z 
J t'flhl	 irrnri.*• J— 
IcI!I I ,r1I.mIi'11c. 1 'IIIiIuIIIIILI. .'J1IIIilIIIFXi' 
!1Luii,-i __ tc1cI_ a	 .1IIIIIIIIIIIIIIN. 1 
Ill— jjI  
Lii 
I1__uI_1_uI_J L
______ 
a 
J _	 1I	 mTI.-% LCI I	 .	 _-1
___
Table 2-14. Sensor Processing (Modified) 
2-20 
. :11 tw. ;b l I i1 t	 wj! - 117 ITT. TTh —
• 
.0 IJfl1 InrT.nTT1 I 
ILif]It I [.J7•1fl.E
ONEEMIA 4.TI A 11 
c-I-, 
It
____  
____ _1xII 
IL-I 
I	 - IJIflLIItl[1W 
I 
rr - II I-I-Ill ]iil,,I it	 -I' 
rr
- Iu[1itl(&1it 1(11
- I 
rruni
I .11 1 I4.lMXJI 
L - IIaI1'1 ]i1	 ]I LL. _
-MXII 
tt.ItJTvrnnilJminjrI' 
WAA-
_________ 
_J.11rr 
I!ninrLrL arrri
-MXII 
•.i-iJrn rriIi-iirii-r!. 
1d4 (Ii II-1iI-	 ]i1j I.14-I-1 lit L 11 11PW
4 All 
M XII 
'41 All
Table 2-15. Propulsion Control (Modified) 
2-21--
Al B I	 C D El	 F G H 
1 Frame IMM	 Thruput Mod Thrupu MargI,	 Proc. Margined ns/Exec instant 
2 Rate	 (Hz (IPS) (IPS) Lag (m rhroughpu  Thruput I 1 _________ ____ _____ Required ______ (1PS) .... 
4 Command & TLM Processing 7  SUM: 134,918 max: 250,000 
5 I	 25 I	 350,0007	 89.946 0.5 40 134.918 5.397 1134.918 
6 Decode & Process Command Data 7.1  
i	 1	 I	 0	 I	 250 0 1 250 250 250.000 
8 Decode & Verify Commands 7.1.1 
-: i	 1	 1	 0	 I	 250 0 1 250 250 250,000 0 Validate Ground Data 7.1.2  I I	 1	 I	 0 I	 250 0 1 250 250 1250,000 2 Command Acceptance 7.1.3  
ii I	 1	 I	 0	 250 0 1 250 250 250,000 
4 TLM Table Manager 7.  1 1--77	 0	 250 0 1 250 250 250,000 6 Format	 Telemetry	 7.2.1  
1. I	 I	 I	 0 250 0 1 2501 250 250.000 8 TLM	 Transit	 7.2.2  j .1 I	 i	 I	 0 250 0 1 250 250 250,000 
2 0 Control	 Telemetry	 Format	 7.3.1  J I	 i	 I	 0	 I	 250 0 1 250 250 250,000 
Table 2-16. Command & TLM Processing (Modified) 
B C D  G H 
1 Frame 1M	 Thruput Mod Thrupu Margined ns/Exec Instant 
2
-
Rate	 (Hz) (UPS) (UPS)  Thruput  Thruput	 (lPS
3
-
WIn
Required
T
________ 
4ije Safety & DOstruct 8  1,666  max: 555,435 
I 1	 3.600 1.111 1666 1666 555435 6 Band Transpond 8.1 
..7_ 1 1	 0 250 0 1 250 250 250,000 8 Receive & Decode Signal 8.1. 
1	 1	 I	 0 250 0 1 250 250 250.000 
1 0 Self-Test	 &	 timing	 8.1.2 _________ _____  
-
iT 1 250 0 1 250 250 250,000 
1 2
--	 - 
Transmit C-Band	 SinaU  
ii 1	 0 250 0 1 250 250 250,000 
1 4 Command Receive & Decode 8.2 
15 - 1	 1
	
0 250 0 1 250 250 -250,000 
1 Destruct Command  Logic 8.2.
-
i7 1 1
	
0 250 0 1 250 250 250,000 
1 8 Select Power Source 8.2.2  
ii 1I	 I	 0 250 0 1 250 250 2508000 20
-- 
Seguece Destruct Signals 8.2 ,3  
- 
1	 1	 I	 0 250 0 1 250 250 250000 
22 IDestruct System Control 8.3 _______ 
231 1	 1	 0 250 0 1 250 250 so,öOö 
2 4 1 Safe & Arm USD5 8.3.1
...! 'l	 1 1	 0 250 0-250 -25O 250,000 26 Verity	 Loop	 lnteqrity
	
8.3.2  
27 
F
1	 1	 0 250 0 1 250 250 250;000 F Y__edundancy Manager .3  
29 
3 0
1	 1	 1	 -	 -	 0 
Initiate	 Destruct	 8.3.4
250 
________
-0 1 
_______
250 
________
250 501000 
_L1I	 -- 0 250
___ 
0 1-	 1 1	 2501 250J_250
Table 2-17. Range Safety & Destruct (Modified) 
2-22 
1IiEIat	 • .I.gj.IIitI.:IiTM —t. 1iIiJ]i1 
U 
U 1	 '1 NOW'1 14s	 I.
Table 2-18. Programmable Payload VF (Modified) 
Based on the throughput estimates discussed above it was concluded that all ALS 
functions except propulsion control can be performed by one FFP. (See Section 4.1 for a 
more detailed discussion). For brevity we denote this FTP the "Core FTP." According to 
[5] the propulsion controllers (also assumed to be FTPs) are assumed to reside at the 
engines; denote these as the "Propulsion FTPs." This functional partitioning guided the 
interpretation of the communication requirements. Specifically, in analyzing the 
communication requirements only the communications between the Core VIP and the 
vehicle sensors and actuators, the Core FTP and the Propulsion VFPs, and the Propulsion 
FFPs and the propulsion sensors and actuators were considered. Moreover, functions 
which did not possess communications requirements or had numerically insignificant 
requirements were excluded from this analysis. 
Determination of temporal load profiles requires detailed knowledge of task and 
communication request scheduling, and this information is currently unavailable. 
Therefore the figures obtained from the requirements are at best average figures and 
primarily of use only in performing a rough sizing of the communications media, and to 
predict the average utilization of the media. 
Using the assumptions and simplifications noted above, the following 
communication requirements were determined. 
2.2.2.1 Core FTP I/O 
This category comprises input from the vehicle's sensors to the Core FTP and 
output from the Core VIP to the vehicle's actuators. This I/O would take place over one 
or more AlPS I/O Networks or through memory-mapped I/O devices resident on the Core 
VrP's Shared or Private Bus. 
Winds Information: "The winds information signal represents the echo returns 
from the electromagnetic probe signals emanated from the winds measurement signal. It 
consists of sixty points, obtained every sixth of a second [4]." Each sample is 16 bits. 
The average bandwidth required for this I/O function is thus 
60 points/sample* 16 bits/point*60 samples/sec = 56,470 bits/sec 
Sensor Signals: This communication path consists of "...2500 non flight critical 
sensors and 600 flight critical sensors, or 3100 sensors. We assume most are sampled at 
50 Hz rate [4]." Each sample is 16 bits. We assume that the non flight critical sensors are 
2-23 
simplex and the flight critical sensors are duplex (fail-operational), resulting in 2500 + 
1200 = 3700 sensor reads per iteration. The resultant average bandwidth required is 
3700 points/sample * 16 bits/point * 50 samples/sec = 2,960,000 bits/sec 
Ground Downlink: "The ground downlink contains the telemetry stream which 
is normally routed through the radio link [4]." The average bandwidth required for this 
function is given directly by the requirements to be 
8,400,000 bits/sec 
Steering Signals: "Two actuator signals of 16 bits each are sent to up to 17 
engines...(at a) 25 Hz rate [4]." The bandwidth required for this communication is 
2 signals/engine * 17 engines * 16 bits/signals * 25 outputs/sec = 13,600 bits/sec 
The aggregate average ALS Core FTP I/O bandwidth requirement is 11,160,340 
bits/sec. 
2.2.2.2 Core FTP - Propulsion FTP Communications 
This category comprises communication between the Core FTP and the Propulsion 
FTP(s). This communication would most likely take place over the InterComputer Net-. 
work.
Propulsion Commands: "The propulsion interface is fairly simple. Command 
examples: Engine Preparation, Engine Start, Engine BIT, Engine Shutdown. Assume 
there (are) at most 16 commands (resulting an a 4-bit command word) [4]." The output 
rate is 50 Hz. The bandwidth required for this function is 
1 command/engine * 17 engines * 4 bits/command * 50 outputs/sec = 3,400 bits/sec 
Propulsion Status: This signal is not described in [4] but is referenced in an 
earlier informal requirements document and corroborated via communication with Martin 
Marietta. It represents the propulsion control transmitting propulsion status to the 
command and telemetry functions, and is of sufficient magnitude to be considered here. 
This signal requires a bandwidth of
10,000,000 bits/sec 
Total Engine Status: The propulsion control periodically reports the engine 
status to the Central Control and Telemetry fun- ctions [4]. Each engine possesses 100 bits 
of status information and the output is scheduled at a 25 Hz rate, resulting in a bandwidth 
Of
2-24
100 bits/engine * 17 engines * 25 outputs/sec = 42,500 bits/sec 
It is not known whether the Total Engine Status signal supplants the Propulsion Status sig-
nal described above. 
Propulsion Memory Dump: The propulsion control function can be 
commanded to dump its memory contents to the Command and Telemetry processing. 
This is normally done only at prelaunch. The bandwidth requirements are given as 
16,000,000 bits/sec 
The aggregate average bandwidth required for communication between the ALS 
Core FTP and the ALS Propulsion FTP(s) is 10,045,900 bits/sec during flight and 
26,045,900 bits/sec during prelaunch propulsion memory dumping. 
2.2.2.3 Propulsion FTP I/O 
The ALS Propulsion FTPs possess interfaces to the engine sensors and actuators. 
This communication would take place over one or more regionally partitioned I/O Net-
works, or through memory-mapped I/O devices resident on the FTPs' Shared and/or Pri-
vate Bus. 
Propulsion Sensor Signals: "There are 357 flight critical sensors, reporting at 
25 and 50 Hz rates, with up to 16 bits accuracy [4]." We assume that this sensor com-
plement suffices for 17 engines. While it is suggested in [4] that the worst-case of 50 Hz 
and 16 bits be used for all sensors, further conversations with Martin Marietta suggest that 
approximately 20 sensors are sampled at a 1000 Hz rate. Moreover, since propulsion 
sensors are flight critical, they are at least duplex, resulting in a total number of sensor 
reads of 337*2 = 674 per 50 Hz sample and 20*2=40 per 1000 Hz sample. The total 
bandwidth required is 
674 points/sample * 16 bits/point * 50 samples/sec

+
40 points/sample * 16 bits/point * 1000 samples/sec = 1,179,200 bits/sec 
Propulsion Actuator Controls: The propulsion control transmits actuator 
commands to the engines via this communication path. The requirements are obscure but 
appear to comprise 4 to 5 fail-operational/fail-operational (triplex) actuators per engine, 
each of which is provided with a 16-bit output value at an assumed 100 Hz rate. We 
assume that each engine therefore requires 15 outputs. Under these assumptions the 
bandwidth requirement is 
15 actuators/engine * 17 engines * 16 bits/point * 100 samples/sec = 408,000 bits/sec 
2-25
The aggregate average ALS Propulsion FTP I/O bandwidth requirement is 
1,587,200 bits/sec. 
2.2.2.4 Summary of I/O and Interfunction Communication Requirements 
The aggregate bandwidth requirements for the three ALS categories are depicted in 
Tables 2-19. Figure 2-4 depicts a functional partitioning of the ALS virtual architecture 
along with the I/O and Interfunction communication bandwidth requirements. 
Category	 Bandwidth. bits/sec 
Core FTPI/O	 11,160,340 
Core FTP - Propulsion FTP Communications 
prelaunch	 26,045,900 
flight	 10,045,900 
Propulsion VIP I/O
	
1,587,200 
Table 2-19. Aggregate I/O and Interfunction Communication Bandwidths 
for ALS 
2.3. RMA and Environmental Requirements 
2.3.1.	 Reliability and Availability 
The ALS mission scenario and Avionics Reliability and Availability requirements 
are summarized in the following paragraphs. 
The ALS mission scenario comprises four phases, each of which has different 
RMA parameters. These phases are: (1) integration and checkout in a facility such as a 
Vehicle Assembly Building, (2) a period of extended launch pad residence, (3) the launch 
or the boost phase, and (4) on-orbit operation. 
During vehicle integration the ALS avionics must be verified to be in a nonfaulty 
state prior to roll out to the launch pad. This implies a degree of testability which has not 
been quantified or modeled in the current study but is expected to be high because of the 
characteristically high diagnosability of the AlPS Byzantine resilient approach to fault 
tolerance. 
The ALS remains on the pad in a condition of launch readiness for a period of up to 
one week, during which no repair or maintenance of the avionics is desirable to avoid costs 
associated with such actions. A "Launch with Faults" policy is assumed to be in effect 
such that the ALS can be launched with avionics system faults, but only if the avionics 
system is known to be capable of fault masking at the time of launch. The availability 
requirement used for the current study is that the ALS avionics system must have a 95% 
2-26
Winds	 Sensor	 Steering Ground 
	
Information Signals 	 Signals Downlink 
(0.056)	 (2.96)	 (0.014)	 (8.40) 
Core 
FrP 
Propulsion	 Total	 Propulsion Propulsion 
	
Commands Engine
	 Status	 Memory 
(0.003)	 Status	 (10.0)	 Dump 
	
(0.043)	 (16.00) 
(Prelaunch) 
Propulsion
FFP(s) 
Propulsion J	 Propulsion 
Sensor [	 Actuator Signals 7	 Controls (1.179) (0.408)
Notes: 
-All bandwidth figures in Mbits/sec 
-17 engines 
Figure 2-4. Functional Partitioning and 1/0 and Interfunction

Communication Bandwidths for the ALS 
2-27 
availability, i.e., must be fault masking with a probability of 0.95, at the end of one week 
on the launch pad. The AlPS configured for the ALS consists of a number of triply or 
quadruply-redundant FTPs connected by a triply or quadruply-redundant InterComputer 
Network. The FTP complement is defined to be fault masking if and only if each FFP is 
capable of correct operation in the presence of any single active Byzantine fault. 
Representative fault masking FTP configurations are a triplex FFP which has suffered zero 
permanent faults and is not in the process of recovering from a transient fault, and a 
quadruplex FTP which has suffered, detected, and masked out the channel containing any 
single permanent fault (becoming in effect a fault-free triplex FT?), and is not in the 
process of recovering from a transient fault. A duplex FTP is not fault masking. The 
InterComputer Network is defined to be fault masking if and only if every FTP can 
transmit a sufficient number of uncorrupted copies of a message to every other FTP in the 
avionics suite to allow voting to generate a correct copy in the presence of any single active 
Byzantine fault in the InterComputer Network. Representative fault masking 
InterComputer Network configurations are manifold and include the case of a permanent-
fault-free triplex IC Network and a quadruplex IC Network which has suffered a single 
permanent Byzantine fault and successfully detected the existence of that fault, in effect 
becoming a fault-free triplex IC Network. 
The launch or the boost phase, i.e., the powered flight segment, is specified to be 
nominally of ten minutes duration. Finally, the launch vehicle enters the desired earth 
orbit. The amount of time in the orbit until the payload is positioned and released may 
vary. In the worst case, it may extend to as long as 48 hours. The maximum allowable 
probability of mission or vehicle loss due to avionics failure for the boost phase and the on-
orbit operations is assumed to be 10-. Because launch occurs only if the avionics system 
is known to be capable of masking a single fault, no single fault can cause the loss of the 
ALS avionics during launch. However, during launch the avionics may fail either due to 
near-coincident multiple faults occurring in a subsystem designed to tolerate only one fault 
at a time, or by the exhaustion of redundant modules in a subsystem such that the 
subsystem can no longer perform its computational or communication functions. An 
example of the former failure mode is the occurrence of near-coincident error bursts on two 
layers of a triplex InterComputer Network. An example of the latter is a series of 
sequential covered faults resulting in the loss of the computational services of an FT?. 
The operational environment for the ALS avionics on the launch pad, in the boost 
phase, and on-orbit is radically different. Different failure rates of components must be 
taken into account to model the system reliability and availability accurately. The avionics 
module failure rates for the three ALS mission phases are tabulated in Section 3. 
2.3.2 Maintainability 
One of the assumptions regarding the operation of the ALS is that launch pad 
avionics maintenance should be avoided as far as possible. However, in the 5% of the 
2-28
launches (the "unavailability" requirement) the avionics would have suffered too many 
faults on the launch pad to provide a fault-masking capability, and it will become necessary 
to perform maintenance while the vehicle is on the launch facility. The avionics must be 
packaged in relatively small units and those units must be located in the vehicle where 
removal and replacement of failed components is facilitated. The Line Replaceable Module 
(LRM) maintenance philosophy adopted by the Joint Integrated Avionics Working Group 
(IIAWG) is one alternative for the maintenance approach. 
The JIAWG has defined support systems and support techniques for vehicles 
which provide easy access to the equipment locations. Much of the JIAWG maintenance 
philosophy can still be used for ALS even though physical access to the ALS Line 
Replaceable Units (LRUs) will probably be not as easy as in an aircraft. 
The modules within the ALS LRUs can be standardized on the Standard Electronic 
Module format E (SEM-E) form factor in accordance with MIL-STD-1389D. These 
modules, depicted in Figure 2-5, are designed for conduction cooled applications and can 
be used in extremely harsh environments (temperature, acceleration, shock and vibration). 
2.3.3.	 Component Quality 
Non-fault tolerant, single-string systems for use in space-borne systems must be 
designed such that the probability of failure is sufficiently small so as to "guarantee" the 
absence of failures during the useful lifetime of the avionics. This level of system integrity 
is accomplished by creating and maintaining a "pedigree" for each item used in the flight 
article. A system's pedigree begins with the use of components which have a traceable 
history. The components are manufactured on well controlled assembly lines where each 
phase of the manufacturing process is reviewed, inspected and certified. Electronic 
microcircuits are manufactured, inspected and tested in accordance with MIL-STD-883C, 
Class S. Discrete components such as resistors, capacitors and transistors are subject to 
equally demanding manufacturing and testing controls (e.g., the JAN-TX quality controls). 
The individual components are then combined to create a subassembly and 
subassemblies are combined to build the components of the system where each phase of the 
manufacturing and assembly process has appropriate inspection and test requirements. 
The AlPS, a fault tolerant system, does not require the "pedigree" constraints 
required for single-string systems. The ability to ensure the placement of a payload in the 
proper orbit is achieved through the redundancy of the system. Since failures are tolerated, 
individual components need not be MIL-STD-883C, Class 5, or JAN-TX quality. A 
comparative cost analysis, described in Section 5 of this report, shows that an effective 
alternative to these components is the use of Class B and JAN-T components provided 
there is a means for properly handling faults. 
2-29
nhJ,I 
-	 k-.58 
U 
Cl.  
5.88
MOO(LAR CONEMR FRAME
TYPE IV, FORMAT £

tilL -C-287S192-2 
LEADED
3.c'FACE MJT

COMPO4Ej"Tg
CROSSOVER FLEX CA8LE

'2 CONTACTS PER SIDE
NA VA 05919 goo 
V	 PROTECTIVE lEADER 
CSDi srw 
n.. 
111111W
.6.42 
POL YIMIDE tLB
30.2 qn. CIRCT./IT AREA
c..	 sriv 
Cta'ECToR TYPE IV, 2S0 CCWTACTS NIL -C2s79,v/oi 2
Figure 2-5. SEM-E Module 
2-30
2.3.4. Radiation Hardness 
A second issue for an ALS avionics suite is radiation hardness. Many space-borne 
systems must be radiation hard where the definition for hardness is determined by the 
purpose of the avionics. The significant difference between the ALS avionics hardness and 
the hardness of an arbitrary payload is the operational lifetime of the two systems. 
A payload boosted to orbit by the ALS may need to be fully operational in a 
relatively high radiation environment for time periods exceeding ten years. The ALS 
mission, however, is complete after it has placed the payload in a specific orbit within a 
few hours after launch, typically 2 hours but in any case no more than 48 hours. Due to 
the extremely short mission times, the ALS radiation hardness is not constrained by a 
technology's total dose capability. The hardness of the ALS avionics will be constrained 
by the Single Event Upset (SEU) rate of the logic family. The AlPS architecture can 
tolerate transients caused by SEUs in the same manner that it can tolerate transient faults. 
However, if the radiation environment and the corresponding SEU rate is sufficiently high 
it can overwhelm the architecture's ability to tolerate these transients. Therefore, 
technologies which are less susceptible to SEUs are preferable even for the AlPS building 
blocks.
Upset rates for the CMOS, Bipolar and GaAs logic families have enjoyed 
continuous improvements in radiation tolerance during the time period of 1985 through 
1990 and it is reasonable to expect continued improvements through 1993. With the total 
dose and transient dose improvements, both demonstrated and projected, the A. avionics 
will not present significant problems with radiation tolerance. 
2.3.5. Power Dissipation 
Operation of the ALS avionics at a launch facility can be supported by auxiliary 
cooling. This cooling could be forced air or a recirculated liquid such as ethylene glycol. 
The avionics must also be capable of operating without degradation after support equipment 
has been withdrawn as well as during the launch and the 48 hours allocated for on-orbit 
maneuvers. System cooling capacity, therefore, will not be determined by the launch pad 
environment but by the ability to cool the avionics while in space. 
The avionics cooling system for use in space will be comprised of one or more 
"cold plates", used to mount the avionics assemblies, and the cold plates will, in turn, be 
thermally connected to radiators via heat pipes. The radiators will be used to radiate the 
heat dissipated by the avionics into space. The thermal capacity of these cooling systems is 
severely limited and the power dissipation of the avionics must be minimized. Typical 
power capacities of radiative cooling systems are two to five Watts per square foot of the 
radiator area. Therefore, the avionic system power dissipation must be minimized. 
Power dissipation minimization for the ALS avionics can be accomplished in three 
ways. The first approach is to use a non-saturating logic family such as CMOS. Other 
logic families such as bipolar are saturating logic and they are characterized by significant 
2-31
power dissipation at all frequencies of operation. The second approach is the minimization 
of the clock frequency of the system. CMOS logic, while it is not a saturating logic family, 
does dissipate power during state transitions. The third approach for power minimization 
is the reduction of the bias voltage on the integrated circuits. 
The total power dissipation of a CMOS-based system can be estimated by the 
following expression: 
pt=(Cpdf,+CLfO)V2 
cc(1) 
where 
Pt = total power dissipated by the system 
Cpd = gate capacitance of the devices 
f1 = internal frequency of operation 
CL = external load capacitance 
f0 = frequency of output signals 
V = supply voltage 
As shown in the above expression, power dissipation is directly proportional to the 
frequency of operation, both internal and external, and the square of the supply voltage. 
The LRUs for the ALS avionics are described in Section 3.5. 
.2.4 Requirements Conclusions 
The overall ALS mission scenario and RMA requirements were obtained from the 
three ALS prime contractors via Martin Marietta Astronautics Group. These included the 
ALS mission phases and durations, the launch availability and the probability of mission 
success.
The ALS computational and communication requirements were obtained from 
Martin Marietta Astronautics Group in the form of directed cyclic graphs depicting inter-
function data dependencies and Hypercard stacks depicting numerical throughput and 
interfunction communication requirements. The requirements were presented as a three-
level hierarchy. 
Nine top-level functions reside at the top of the depiction hierarchy: Central Control 
and Processing, Winds Ahead Determination, Vehicle Power System Management, Steer-
ing and Staging Control, Propulsion Control, Command and Telemetry Processing, Range 
Safety and Destruct, and Programmable Payload Interface. Aggregate throughput estimates 
were given for many of these functions. These aggregates were observed to comprise an 
overestimation of the computational requirements of their constituent functions. The third 
level of the hierarchy represents atomically schedulable computational tasks. Currently, 
Central Control and Processing is the only function for which level-three task requirements 
are available.
2-32
Based on the data provided, the overall ALS throughput requirements were esti-
mated to be approximately 8.8 MIPS for non-propulsion functions and 4.8 MIPS per 
engine for propulsion functions. The inter-FTP communication bandwidth requirements 
were estimated to be 26 Mbits/sec (prelaunch) between the Core FTP and the Propulsion 
FTP(s) (17 engines), the Core FFP I/O bandwidth requirement was estimated to be 11.2 
Mbits/sec, and the Propulsion FI'P(s) I/O bandwidth requirement was estimated to be 
1.587 Mbits/sec (17 engines). 
To more accurately estimate the throughput and bandwidth needs of the ALS 
avionics system the following information is desirable for each task: task frame or iteration 
rate, instructions per frame, throughput margin, processing lag, scheduling requirements 
(i.e., preemptible or nonpreemptible), task execution order and/or data dependencies, and 
inter-function communication requirements (i.e., bits per frame, source or destination of 
data, and latency requirements). The data presented by Martin Marietta served as an impor-
tant starting point for determining the requirements of the ALS avionics. During the course 
of the CSDL-Martin Marietta interaction an active dialogue was set up which would have in 
time resulted in a more complete definition of a common requirements vocabulary and fa-
cilitated the acquisition of comprehensive requirements data. 
2-33
3.0 AlPS ARCHITECTURE OVERVIEW 
Over a period of about six years, the Draper Laboratory has been developing a fault 
tolerant distributed computer system architecture suitable for advanced launch vehicles. 
The overall program objective of the Advanced Information Processing System (ALPS) 
Program, has been to produce the knowledgebase which will allow achievement of 
validated fault tolerant distributed computer system architectures suitable for a broad range 
of aerospace vehicles. The architecture that has been conceived to meet these requirements 
is based on the notion of prevalidated building blocks. 
The ALPS architectural attributes, rules and guidelines, and reliability and 
performance models of the building blocks are described in detail in an accompanying 
report "Advanced information Processing System: Design and Validation Knowledgebase" 
[1]. It also contains references to the more detailed hardware and software specifications 
and simulations that constitute the AlPS knowledgebase. This knowledgebase, which is 
quite large, is required to synthesize a validated ALS avionics architecture. The following 
subsections briefly recapitulates the ALPS virtual and physical architectures and the key 
attributes of the hardware and software building blocks. 
3.1 Building Blocks 
ALPS is amulticomputer architecture composed of hardware and software building 
blocks that can be configured according to certain design rules and guidelines to meet the 
specific requirements of a given application. 	 - 
The hardware building blocks, as shown in Figure 3-1, are Fault Tolerant 
Processors .(FTPs), Networks, and Interfaces. The FTPs are general purpose computers 
which can be built in varying redundancy levels from simplex to quadruplex, using one to 
four identical channels, to meet varying levels of reliability requirements. The networks are 
communication media and are composed of circuit-switched nodes linked together with full 
duplex links. The networks can be configured in various topologies such as a ring, braided 
mesh, irregular mesh, etc. Networks can also be made redundant. Networks are used to 
connect FTPs to input/output devices (these are called I/O networks) and to other FTPs 
(these are called Inter-Computer or IC networks). I/O and IC networks are built out of 
identical nodes and links. The interfaces are the building blocks that are used to interface a 
channel of the FTP to an I/O Network, called the I/O sequencer or 105, and to the IC 
Network, called the IC Interface Sequencer or ICIS. 
The software building blocks are the major software functions: local system 
services, input/output system services, inter-computer system services and the system 
manager. This software provides the services necessary in a traditional real time computer 
such as task scheduling and dispatching, communication with sensors and actuators, etc. 
The software also supplies the redundancy management services necessary in a redundant 
3-1 
FLAW	
PRECEDING PAGE BLANK NOT FILMED
computer and the services necessary in a distributed system such as inter-function 
communication across processing sites, management of distributed redundancy, 
management of networks, and migration of functions between processing sites. 
I.::  
FAULTTOLE 
PROCESSORS 
RANT	 Simplex	 Duplex_.j Tx ' jJ Qdfl1P 
INTERCONNECTION	 [rt4	 > ______ NETWORKS
MESH BRAIDED REDUNDANT RING REDUNDANT 
MESH	 BUSES	 RINGS 
INTERFACES	 EMI In put/Output Interfaces 
EM Inter-Computer Interface Sequencer 
Figure 3-1. AlPS Hardware Building Blocks 
3.2 Virtual Architecture 
One of the important and unique attributes of the AlPS and other Draper-designed 
fault tolerant computers is that redundancy and its management are transparent to 
applications software. Furthermore, even most of the system software is unaware of the 
redundancy of the underlying hardware on which it executes. The only exceptions are 
those system services that are directly responsible for managing the redundancy. (They 
have to know about the existence of redundant hardware in order to manage it.) This AlPS 
attribute allows almost all of the software to be developed and validated on a simplex 
processor, in a software development environment familiar to most programmers and using 
mature tools. We call the architecture, as it appears to the programmer, the virtual 
architecture. 
The AlPS virtual architecture is a conventional multicomputer architecture as shown in 
Figure 3-2. It consists of a number of processing sites each containing an FTP and the 
necessary external interfaces. The processing sites are linked together by an Inter-
Computer or IC bus. An FTP at any particular processing site may also have access to 
varying numbers and types of I/O buses, which are separate from the IC bus. Separate 
buses to carry sensor data and intercomputer data are provided because the bandwidth and 
reliability requirements for these two classes of data in most realtime systems are very 
different. The I/O buses may be global, regional or local in nature. I/O devices on the 
global I/O bus are available to all, or at least a majority, of the AlPS FTPs. Regional buses 
3-2
InterComnutr	 - Global 
I/o
Figure 3-2. AlPS Virtual Architecture 
connect I/O devices in a given region to the processing sites located in their vicinity. Local 
buses connect an FFP to the I/O devices dedicated to that computer. Additionally, devices 
may be connected directly to the internal bus of a processor and accessed as though the I/O 
devices reside in the computer memory (memory mapped I/O). The regional and global 
buses allow sharing of raw sensor data among functions that reside on different processing 
sites, thus reducing the overall system cost. They also allow functions to be migrated 
between FTPs in real time in the event of faults, damage or change in mission phase and 
work load. The memory mapped I/O is used to access time critical sensors to meet 
stringent transport lag requirements for real time control applications. (Transport lag is the 
time elapsed from reading a set of sensors to asserting an actuator command in response to 
that input). 
The virtual architecture of a processing site is shown in Figure 3-3. It consists of 
three sections: a computational section, an I/O section, and the resources shared between 
them. The computational and I/O sections are identical, conventional processor 
architectures. Each consists of a processor, memory, interval timers and memory mapped 
I/O (which is unique to each processor). Although identical in hardware design, the 
3-3
computational processor or CP is typically devoted to application functions (such as 
executing vehicle control law) while the I/O processor or the lOP is devoted to I/O 
functions such as reading and validating sensors and sending out actuator commands. The 
CP and TOP communicate with each other via the shared memory. Other resources shared 
by both processors include a data exchange mechanism which is used to exchange and vote 
data with other redundant channels in this FTP, a real time clock and interfaces to several 
I/O buses and the IC bus. 
3.3 Physical Architecture 
The parameters that define the physical architecture include redundancy levels of 
FTPs, interconnections of redundant channels in an FTP, redundancy level of sensors, 
actuators and other I/O devices, cross-strapping of I/O devices to channels of FTPs and 
redundancy level of their interfaces, redundancy levels of IC and I/O networks and their 
physical topologies. This section highlights some of the salient points of the architecture. 
Figure 3-4 shows the physical architecture of the quad redundant AlPS FFP. It is 
designed strictly according to the fundamental fault tolerance theory. It complies with all 
the requirements for tolerating two sequential Byzantine failures of Fault Containment 
Regions (FCRs). In addition to redundancy, other features that provide hardware and 
software fault tolerance include watchdog timers, processor interlocks, a privileged 
operating mode, hardware and software exception handlers, and self test& A majority of- 
correctly operating channels can disable all outputs of a failed channel using the processor 
interlock mechanism. A channel that is failed active is thus prevented from transmitting 
erroneous data or commands on I/O networks and IC networks or to local I/O devices. 
Figure 3-4 also illustrates how redundant sensors are connected to the redundant 
FTP channels. In this example, three redundant copies (Si, S2, S3) of a sensor S are 
attached to three of the four FT? channels. No cross-strapping of the sensors to FTP 
channels is shown for simplicity, although it is possible and likely for some critical 
sensors. The process by which all four FFP channels derive a congruent value of the 
sensor S is as follows. Channel A reads sensor Si and all four channels then execute the 
two-round Byzantine resilient exchange algorithm which culminates in all four channels 
receiving a congruent value of Si, say Vi. The process is repeated for sensors S2 and S3. 
Now all four channels have the same three sensor values, say, Vi, V2, and V3. To obtain 
a valid sensor value V, the three sensor values must be compared and voted. However, a 
bit-for-bit voting of redundant sensors is usually not possible since sensors measure real 
world parameters such as pressure, temperature, angle, acceleration,. etc. which are all 
analog quantities. Even under no fault conditions, digital representations of redundant 
sensor values differ from each other. That is, the values Vi, V2, and V3 may be different 
even though the sensors Si, S2, and S3 are all operating correctly. However, since they 
do represent real world physical quantities, a number of reasonableness checks such as rate 
3-4.
1/0 Devices rio Devices 
_	 ___
FTP1 
H IMH 
CPBus I
Shared Bus
?.Pu.
InterComputer
lOP Bus 
Real	 I Shcd I Da
IEgIlMyI 2
Bus 
1/0 Devices Devices
-
=I—jFrP3 
Legend: FTP Fault Tolerant Processor	 Bus 1 Bus 2 
CP Com putational Processor 
lOP I/O
	
occssoi 
LOS I/O Sequencer 
ICIS InterComputer Interface Sequencer 
Figure 3-3.	 AlPS Processing Site Virtual Architecture 
FR I 
cpui1 CPU2 
Local
	
Local Memory	 Menoty FCR 5 
LocaW)	 LocaU ______ 
Sensor SHARED HARDW INTERSTAGE 
Communicator Channel A 
---k ICIS 
Fault Tolerant Clock
 
f-:k
Shared Memory 
FCR2 """ 
FCR6 
Sensor
CPUj CPU2 
Jared Channel BHardware Offi _____ 
CPU2 FCR 7
Sensor
i
Channel C Hardware 
— — — — — - — — — — — — — FCR 4 FR 8 
I,ICPU2 I	 I I RS I At, 
1:. jhared Hardware __ :  Channel D
Figure 3-4. AlPS Quad Redundant Fault Tolerant Processor:

Fault Containment Regions and Interconnections 
3-5 
of change and minimum/maximum range of values can be used to filter out a grossly 
misbehaving sensor. Mid-value select, average or mean values of the remaining sensors 
can then be used to arrive at a valid sensor value in all channels. Note that the value will 
also be congruent since all channels execute identical sensor redundancy management 
algorithm with congruent sensor inputs. 
The physical realizations of the virtual I/O and IC buses are the fault and damage 
tolerant circuit-switched networks. A network consists of a number of full duplex links 
that are interconnected by circuit-switched nodes. In steady state, the circuit switched 
nodes route information along a fixed communication path, or "virtual bus", within the 
network, without the delays which are associated with packet switched networks. Once the 
virtual bus is set up within the network, the protocols and operation of the network are 
similar to typical multiplex buses. 
Although the network performs exactly as a bus, it is far more reliable and damage 
tolerant than a linear bus. A single fault or limited damage can disable only a small fraction 
of the virtual bus, typically a node or a link connecting two nodes. By reconfiguring the 
network around the faulty element, a new virtual bus is constructed. The nodes are 
sufficiently intelligent to recognize reconfiguration commands from the network manager 
(explained in System Services Section) which is resident in one of the FTPs. The network 
can also be expanded very easily by adding more nodes linked to spare ports in existing 
nodes.
To maintain the fault tolerance requirements, each FTP channel receives data from 
all three intercomputer network layers but can physically transmit on only one layer, as in 
the AlPS Engineering Model shown in Figure 3-5. All three layers of the IC network are 
used together when transmitting and receiving data. Since all channels of a triplex site are 
executing the same code synchronously, all three channels (each channel transmitting on a 
different layer) transmit identical messages. Thus, within some skew, the redundant layers 
of the network contain the same message. This allows the receiving site to vote the three 
layers, masking any failure. Although always receiving on all three layers, duplex sites can 
transmit on only two of the three layers of the network, and simplex sites on only one of 
the three layers. Thus, malicious failure of a channel can disrupt only one layer. An 
example of such a failure is a continual broadcast (babbling) by a channel on a network or 
intermittent transmissions that collide with legitimate transmissions by other FTPs on that 
network layer. 
For access arbitration purposes, the triplex network is treated as a single entity. 
FTPs, regardless of their redundancy level, contend for all three layers of the network. At 
the end of the contention sequence one, and only one, FTP may have access to all three 
layers of the network.
3-6
The IC networks, the FTP interfaces to the networks (ICISes), and the arbitration 
logic are designed in strict accordance with the fault tolerance theory [7]. The system 
provides error-masking capability for intercomputer communication between triplex (or 
higher redundancy level) FFPs. An arbitrary hardware fault, including Byzantine faults, 
anywhere in the system can not disrupt communication between FFPs of triplex or higher 
redundancy level. 
3.4 System Services 
Each processing site has the ability to operate autonomously, particularly for the 
performance of critical functions. However, the System Services allow the coordinated use 
of the entire information processing system to provide attributes superior to the more 
federated systems that are typical for current aerospace vehicles. 
The Local System Services in each FTP include FTP initialization, a real time operating 
system, local resource allocation, FTP fault detection, isolation and reconfiguration 
(FDIR), and local time management. The real time operating system supports task 
execution management, including scheduling according to priority, time and event 
occurrence, and is responsible for task dispatching, suspension and termination. It uses 
the vendor-supplied Ada Run Time System (RTS), and includes additional features 
required for the AlPS real time distributed operating system. 
• FDIR has the responsibility for detecting and isolating hardware faults in the CPs, 
lOPs, and shared hardware. It is responsible for 'synchronizing both groups of processors 
in the redundant channels of the FTP and for disabling outputs of failed channel(s) through 
interlock hardware. The CPU hardware exception handling and downmoding/upmoding 
hardware in response to configuration commands from the system manager are also 
performed by the FDIR function in the FTP. It is also responsible for transient hardware 
fault detection and for running low priority self tests to detect latent faults. The local time 
manager works in cooperation with the system time manager to keep the local real time 
initialized and consistent with the universal time. 
The I/O system services provide efficient and reliable communication between the 
user and external devices (sensors and actuators). The I/O system services software is also 
responsible for the fault detection, isolation and reconfiguration of the I/O network 
hardware and the lOS. 
The IC user communication service is designed along the ISO's seven layer Open 
Systems Interconnect model. It provides local and distributed inter-function 
communication (point to point or broadcast mode) which is transparent to the application 
user. It provides synchronous and asynchronous communication, performs error detection 
and source congruency on inputs, and records and reports IC network errors to the IC 
network layer managers. The IC network manager is responsible for the fault detection, 
isolation and reconfiguration of the inter-computer network. 
3-7
AlPS ENGINEERING MODEL CONFIGURATION 
I/O NETWORK 
15 NODE CONFIGURATION 
FTP3	 FTP2 
Figure 3-5. AlPS Engineering Model Configuration 
3-8
ORIGINAL PAGE

COLOR PHOTOGRAPH
The system manager is a collection of system level services: The system resource 
manager allocates migratable functions to FTPs. This involves the monitoring of the 
various triggers for function migration such as failure or repair of hardware components, 
mission phase or workload change, operator or crew requests and timed events. The 
system fault detection, isolation and reconfiguration (FDIR) is responsible for the collection 
of status from the inter-computer (IC) network managers, the I/O network managers, and 
the local GPC redundancy managers. It resolves conflicting local fault isolation decisions, 
isolates unresolved faults, correlates transient faults, and handles processing site failures. 
The system time manager, in conjunction with the local time manager on each FTP, has the 
job of maintaining a consistent time across all FTPs. 
3.5 Flight System Characteristics of Building Blocks 
3.5.1 Functional Building Blocks 
The AlPS hardware building blocks are the Fault Tolerant Processors, the 
Input/Output and InterComputer Networks, and the network interfaces as shown in Figure 
3-1 and are used to implement the AlPS virtual architecture and the FTP virtual architecture 
shown in Figures 3-2 and 3-3, respectively. The building blocks are comprised of the 
Fault Tolerant Processor Channel and the Communications Node and each of these are 
composed of smaller functional elements or modules. These hardware building blocks and 
the modules which make up the building blocks are described in the following paragraphs 
and are based on the technology projections-for the 1992-1993 time frame. Additionally, 
the implementation considerations for performance, radiation hardness, power dissipation 
and maintainability outlined earlier in the requirements section have been incorporated into 
these building blocks. 
3.5.1.1. Fault Tolerant Processor Channel 
The Fault Tolerant Processor Channel is shown in Figure 3-6 and incorporates the 
FTP functions identified in Figure 3-3. The channel is comprised of two Central Processor 
Unit (CPU) modules, a Shared Devices module, one or more Input/Output Sequencer 
(lOS) modules, one or more InterComputer Interface Sequencer (ICIS) modules, and the 
Communicator and Interstage module. The channel's modules are interconnected using a 
shared bus which is common to all modules in the channel. These modules are described 
in Subsections 3.5.1.1.1 to 3.5.1.1.5. 
Taken together, the hardware in a quad-redundant FTP will support at least a FAIL-
OP. FAIL-OP mode of operation. A quad-redundant FTP will continue to operate correctly 
after two sequential, arbitrarily malicious faults in any two FCRs. Additionally, it can also 
survive certain combinations of triple and quadruple FCR failures. For example, sequential 
failures of two processors in different channels and their associated Interstages will be 
masked. After these four failures have occurred, the FTP will continue to function 
3-9
:	 a 
lib
Network Signals
Input / Output
Sequencer
(lOS) 
Communicator
Module
correctly in a duplex mode (two channels still functional and voting inputs, state data and 
outputs). The fifth failure can be detected with near perfect coverage resulting in FAIL-
STOP capability. Alternatively, if the operational requirements are such that it is desirable 
to continue operation in a simplex mode, then the probability of FAIL-OP after the fifth 
failure is directly related to the completeness of a channel's self-tests (self-test coverage). 
3.5.1.1.1. Processor 
The processor will be based on the MIPS, Inc., R3000 or similar performance 32-
bit RISC architecture. The nominal frequency of operation will be 40 MHz which can 
provide 12 to 18 MIPS peak performance (DAIS mix). The CPU module will have both 
Random Access Memory (RAM) and Read Only Memory (ROM) and the combined total 
will be between one and four megabytes. The processor module will also have 
counters/timers used for support of the operating system and an interrupt controller. A 
local bus interface will be provided for expansion of the ROM if the size of the operating 
system and application software should require additional memory space. 
CPU Module	 CPU Module	 Shared 
1
	
2
	
Devices 
Channel's Shared Bus
 -	 Interstage	 L 
__	 Module 
lnterctanne[	 4 4 Iriterchannel Signals	
I---1---i- Signals 
InterComputer
Interface
Sequencer (ICIS)
InterComputer
Network 
Figure 3-6. Fault Tolerant Processor Channel 
The processor, executing instructions at the 40 MHz clock rate, must be provided 
instructions and data at that clock rate. Memory devices with access times less than the 25 
nsec clock period, however, will continue to have relatively low bit densities (16 KBits to 
3-10
256 KBits per integrated circuit) and their active power dissipations will be high (one to 
two Watts per integrated circuit). The CPU module's performance and power dissipation, 
therefore, will be determined by the memory architecture implemented on the module. 
While it is possible to implement a cache memory system (very low access time 
memory implementation) to support the fast cycle times of the processor integrated circuit, 
the relatively small memory space required in the AlPS for ALS can be implemented such 
that all of the memory appears to operate as a cache. This can be accomplished by 
implementing the memory array as 64 bit or 128 bit wide memory rather than the 32 bit 
words needed by the integrated circuit. For example, if a memory array is implemented as 
128 bits wide then the memory array is 64 KWords long (each Word is now 16 bytes). 
When a memory read cycle is accomplished, 128 bits of data or instruction are read from 
the memory and are held in a memory read buffer. Subsequent memory accesses are 
checked to determine if they are from a memory location contained in the buffer. If the 
access is from a location contained in the buffer then the information is provided to the 
microprocessor without the penalty of another memory access. Memory write cycles must 
be completed by a physical write to memory. 
The other circuitry on the CPU module (timers, counters, interrupt controllers, etc.) 
are common to most microprocessor based systems used in real-time control systems. 
Their use in the CPU module is comparable to these other applications and to the AlPS 
proof-4-concept system. No special provisions are required for these functions. 
3.5.1.1.2. Shared Devices 
System level support circuits to be included on this module are the channel's shared 
memory (RAM), real-time clock, timers and interrupt controller. 
The shared RAM in the channel is used for interprocessor communications and for 
intermediate storage of a limited quantity of data. The use of this memory as a 
communications area adds the derived requirement that its access time, including any 
shared bus overheads, be minimized. This memory, therefore, must have access times 
which are as short as possible. Unlike the memory to be implemented on the processor 
module, this memory space will be subjected to random accesses to data structures located 
in the shared space and no benefits can be accrued by providing a multiple word access 
capability. 
3.5.1.1.3. ICIS and lOS Hardware 
The InterComputer Interface Sequencer (ICIS) and Input/Output Sequencer (lOS) 
provide the FTP channel's interface to the redundant, fiber-optic communication paths 
needed in a network of fault tolerant computers and input/output devices. While the 
purposes of the IC network and the I/O network are different, the communications protocol 
used in each network is identical and common hardware will be implemented to interface to 
3-11
both networks. A block diagram of this Network Interface Sequencer (NIS) module which 
replaces the functions of the ICIS and lOS is shown in Figure 3-7. 
The communications protocol used on the networks will be based on the Fiber 
Distributed Data Interface (FDDI) standard with a bit rate of 100 MegaBits per second 
(MBPS) which is 50 times the 2 MBPS data rate of the AlPS engineering model network. 
FDDI is normally implemented as two counter-rotating physical rings and access to the 
rings is controlled using a token passing protocol as defined by IEEE-Std-802.5 (see 
Figure 3-8). The implementation of the IC and 110 networks for the ALS will be a 
modification of the FDDI standard where a modified Laning Poll will be used to access the 
network of virtual buses rather than the FDDI token.
PROTOCOL
 INPUT L _-	 RCVR DECODER OUTPUT
RECEIVE TRANSMIT 
BUFFER BUFFER 
PROTOCOL INPUTM
DECODER
RECEIVE 
BUFFER CHANNELS
-
BUS 
INTERFE 4TERNALBUS 
• PROTOCOL 
INPUT N	 RCV DECODER
RECEIVE 
BUFFER • OUTPUTS 
• STATE • TO 
EXCHANGE ADJACENT 
CHANNELS 
NETWORK ACE (KS and 10S)
INPUTS 
FROM 
ADJACENT 
CHANNELS 
Figure 3-7. Network Interface Sequencer 
The FDDI token consists of three bytes and its construction is specific to ring type 
networks. Two bytes are the starting and ending delimiters and these are used for phase-
locked-loop (PLL) synchronization as well as marking the beginning and ending of the 
token frame. The third byte is the access control byte and it contains the token bit as well 
as other bits for monitor, priority and priority reservation. The monitor bit is used by the 
ring monitor (a station responsible for monitoring the operation of the ring) to detect 
"orphan" frames and thereby remove them from the network. The priority and priority 
reservation bits are used to establish the operating level of the network and to make 
"reservations" for future data transfers.
3-12 
The Laning Poll supplants the FDDI token and is an extension to the poll 
implemented in the AlPS engineering model. The two significant modifications to the 
Laning Poll is the addition of the Q-Bit, which is required for quad-redundant processing 
sites, and the reduction in the arbitration timing. 
The proof-of-concept system was built with triplex, dual and simplex processing 
sites and the Laning Poll was implemented accordingly; there were no provisions for the 
addition of quad redundant processing sites in the engineering model. The Q-Bit, which 
precedes the T- and D-Bits, allows a quad redundant processing site to poll for control of 
the network and, because the most significant bits are transmitted first, do so at a high 
priority than other processing sites with lower degrees of redundancy. The other bits of the 
Laning Poll will have the same significance and will operate like the existing network. The 
algorithm for the poll is: 
FOR ALL i WHILE contending DO: 
TRANSMIT P1 on the NETWORK 
IF P 1 = 1 and RECEIVED = 1 then CONTINUE VIE 
IF P1 = 1 and RECEIVED = 0 then VIE _WON 
IF P1 = 0 and RECEIVED = 1. then VIE—LOST 
IF P1 = 0 and RECEIVED = 0 then CONTINUE VIE 
Figure 3-9 depicts the network arbitration timing for the AlPS engineering model 
system and the timing for an advanced technology ALPS with a comparable number of 
processing sites and Communications Nodes. The incorporation of technology projected to 
be available in 1992 will permit an improvement in the arbitration timing in excess of one 
order of magnitude. An ALPS network sized for the Advanced Launch Vehicle, however, 
requires far fewer processing sites and Communications Nodes and will require even less 
arbitration time. Overall, the network can be arbitrated in 1116th to 1148th the time required 
for the engineering model, depending on the number of nodes, as shown by the timing 
values in Figure 3-9. 
The physical layer of the network is the FDDI implementation. It does not use 
Manchester encoding because this form of data transmission requires frequencies which are 
twice the data rate (200 MHz for 100 MBPS). The encoding scheme used at the physical 
layer is "four out of five" where each group of four symbols (binary digits) is encoded as a 
group of five bits on the medium. Sixteen of the 32 possible combinations are for the data 
bit patterns, three combinations are for delimiters, two are for control, three are for 
hardware signaling, and eight are reserved for future use. 
3-13
I SD I AC I ED I 
Token Frame Format 
I SD I AC I FC I	 DA	 I	 SA	 I	 Data	 Checksum I ED I FS I 
Data Frame Format 
SD - Starting Delimiter (1 byte) 
AC - Access Control (1 byte) 
FC - Frame Control (1 byte) 
DA - Destination Address (2 or 6 bytes) 
SA - Source Address (2 or 6 bytes) 
Data - Data Field (any number of bytes) 
Checksum - Checksum for DA+SA+Data (4 bytes) 
ED - Ending Delimiter (1 byte) 
	
FS- Frame Status (1 byte) 	 FDDI 
IEEE-Std-802.5 
FOOl with Laning Poll 
	
I Laning Preamble l SD I AC I FC I	 DA	 I	 SA	 I	 II I Checksum I ED I ES I 
Data Frame Format (same as above)	 11
Laning Preamble 
	
Start Bit I Q-Bit I 1-Bit I D-Bit I
	
Priority Bits	 I	 Device ID Bits	 I 
Start Bit - Mark the beginning of the Poll Sequence 
Q-, 1- and D-Bits - Identify Quad, Triplex and Duplex processing sites 
Priority Bits - Three bits specifying the priority of the request 
Device ID Bits - Six bits specifying the ID of the processing site 
Figure 3-8. IC and 110 Network Data Frame Format 
The Network Interface Sequencer module, when used as the interface to the 
InterComputer Network, will be connected to the three network layers as was done in the 
proof-of-concept system. The I/O network does not have the redundant layers, however, 
and the module design, with the three inputs and the state exchange logic, does not appear 
to be capable of operating with the single layer I/O network. The common interface module 
design will interface to the I/O network if a fiber optic beam splitter is used to provide the 
three parallel inputs required by a MS module, and the state exchange outputs are "looped 
back" so the state exchange inputs are monitoring their own exchange outputs. In this 
way, an interface module connected to the I/O network will "vote" the single set of input 
data and the state machines and state exchange hardware will be "synchronizing" with 
itself.
3-14
	48 usec	
.I24 use%
,
 j24 usec[.1 24 usec . j .1	 216 usec 
Start Bit
Priority and ID 
	
3 usec	 1.5 usec_ i .1.5 use i .1.5 use	 13.5 usec 
	
Start Bit	 I	 Q-Bit	 I	 T-Bit	 I	 D-Bit
Priority and ID 
	
1 usec	 i .0.5 user- i 0.5 useci u 0.5 usec i
	 4.5 usec 
	
Start Bit	 I	 Q-Bit	 I	 T-Bit	 I	 D-Bit
Priority and ID 
Figure 3-9. Network Arbitration Timing 
3.5.1.1.4. Communicator and Interstage 
The communicators, resident on each shared bus and thus shared by the processors 
in their respective channels, together with the Interstages are used to tie the four channels of 
the FFP together in a manner that guarantees that each FTP processor can maintain 
congruent inputs, state information and outputs. The interchannel hardware, the 
communicators and the interstages, is used to exchange and vote data, fault-tolerant clocks, 
and external interrupts. 
The interstages provide the additional Fault Containment Regions (FCRs) and the 
connectivity required for an efficient implementation of the two round communication 
algorithm required to correctly handle Byzantine failures. To ensure the integrity of these 
additional FCRs, the interstages are powered by power supplies independent of the power 
supplies used to power the processors and shared hardware. In addition, all inter-FCR 
connections are made using fiber optic drivers, receivers and cables so that the integrity of 
the FCRs cannot be compromised. This electrical isolation prevents a fault in any given 
FCR from migrating past the boundaries of that region and corrupting other FCRs. 
3-15
The communicator and interstage function will be based on the corresponding 
functions developed in the ALPS engineering model. The three fundamental purposes of 
the communicator and interstage hardware and its fiber optic data communication paths are: 
1) Provide the paths for distributing data in one channel to all other channels; 2) Provide a 
mechanism for comparing results of the redundant channels; and 3) Provide a path for 
distributing and comparing timing and control signals such as the fault tolerant clock and 
external interrupts. As shown in Figure 3-6, the communicator and interstage functions 
reside in each channel and are accessed via the channel's shared bus. 
Two types of data exchanges are possible. These are simplex exchanges and voted 
exchanges. The simplex exchange is used to distribute copies of data from one channel to 
all other channels in a congruent fashion. An example of such a data item is the value of a 
sensor that is available in only one channel. Voted exchanges, on the other hand, are used 
to compare and vote results of the redundant channels. An example is an actuator 
command produced by a control law implemented in all channels, which is to be voted 
before the command is issued to the actuator. 
Data will be exchanged between the redundant channels one 16 bit word at a time. 
To perform a voted exchange, each processor writes the value to be voted in the transmit 
register. Writing to this register initiates a sequence of events in hardware which 
culminates with the voted value being deposited in the receive register of each processor. 
The processors in each channel can read the receive register at this point to fetch the voted 
value.
A significant portion of the existing communicator function has been implemented 
as an Application Specific Integrated Circuit (ASIC). The ASIC, a 6000 gate, 2.0 micron 
CMOS Configurable Gate Array, implements the data communicator function but the clock 
and interrupt communicator functions are implemented separately. The implementation for 
the Advanced Launch System will use a 20,000 gate, 0.8 micron CMOS gate array and will 
incorporate the data, clock and asynchronous interrupt communicator functions. The data 
paths will be serialized/deserialized using logic also on the new gate array. The serial data 
rate will be 128 MBPS which supports peak data exchange rates of eight megabytes per 
second.
The interstage functions will also be implemented using a 20,000 gate 0.8 CMOS 
gate array. This gate array will integrate the data, clock and asynchronous interrupt 
interstage functions. The communicator and interstage functions are all very similar and, as 
such, the two ASIC will have approximately 95% common designs. 
Figure 3-10 shows the block diagram of the communicator and Figure 3-11 shows 
the block diagram of the interstage. Critical to the proper operation of the communicator is 
the fault masking action inherent in the design of the hardware voter. The voter compares 
the four parallel data streams on a bit-by-bit basis and produces a "majority" output bit for 
3-16
each input bit. Any disagreements in the voting process are detected and the identity of the 
disagreeing input is stored in an error register. There are five bits in the error register: one 
bit for each of the four inputs and a fifth bit which is set to indicate that a pairwise split was 
detected. In the presence of a single fault, the voter produces the correct result and latches 
the identity of the faulty input. 
A voter with four inputs can tolerate two faults if and only if it is has been 
configured to vote on only three inputs (assuming one of the faulty inputs is the input being 
ignored). Therefore, associated with each voter in the communicator is a voter mask 
register. This register contains a four bit mask used by the voter and permits voting of data 
only from the unmasked inputs. 
The AlPS approach to achieving exact consensus is to use identical software 
running on identical, redundant, "clock deterministic" hardware operating in tight, micro-
To My Interstage 
To Left, Right 
ow	 and Opposite 
Communicators 
Now-rI I0 LRO  I 
Inputs from Other
	
I - 
Communicators	 I i I I
MI1II 
Inputs from	 L II I 
Interstages	 Ru	 VOTER 
01
	 a—
I Voter Mask I 
Figure 3-10. FTP Communicator 
3-17
Line Drivers 1M 
From	 L 
My Communicator	 Line Drivers
To 
R	 Communicators 
Line Drivers 
Line Drivers 0 
Figure 3-11. FTP Interstage 
frame synchrony. As noted earlier, for hardware to be "clock deterministic", it must 
perform each of its operations in a fixed and predictable number of clock 
cycles. Synchronous operation of that hardware then allows an efficient, hardware oriented 
solution to the Byzantine Generals Problem. It relieves the software of the burden of 
maintaining process synchrony. In addition, it allows the necessary voting to be performed 
in hardware in a straightforward manner. This means that errors can be detected and 
masked as they occur, transparently to the application software. 
One way to achieve synchronization of the redundant hardware would be to drive it 
with a single clock. However, a common clock would represent a potential single point 
failure. Therefore the AlPS FTP uses redundant clocks. To maintain synchrony, the clock 
signals are digitally phase-locked to one another. To ensure that they are resilient to 
Byzantine failures, they are exchanged and voted in a manner similar to the way data is 
exchanged and voted. In this case, exact consensus between channels is achieved on the 
relative phase of these clock signals and they are collectively known as the Fault-Tolerant 
Clock (FTC). 
Each channel of the AlPS FTP maintains its own version of the FTC. Like the rest 
of the FTP, the FTC hardware is partitioned into eight FCRs as depicted in Figure 3-12. 
However, the network topology used to interconnect these FCRs is different. Unlike the 
communicators described above, each FTC communicator broadcasts its version of the 
FTC to all FTC interstages because each channel's version of the FTC is a separate, 
simplex clock source which must be exchanged and voted. Requiring each FTC 
communicator to broadcast its version of the FTC to every FTC interstage, four simplex 
source exchanges are performed in parallel with an additional voting plane at the 
interstages. The voted clocks produced at each FTC interstage are then broadcast back to 
all FTC communicators where they are voted the second time. 
The frequency of the FTC is nominally 2 MHz which is too slow to drive most of 
the hardware (for example, the microprocessors require a 40 MHz clock). Therefore, a 
faster clock that is deterministically related to FTC is required to drive the hardware in each 
3-18
channel. In the AlPS engineering•model, that faster clock is a 16 MHz clock which, in 
turn, is. used to produce FTC. For the ALS, the faster clock will be the 40 MHz clock 
needed by the microprocessors and their supporting logic. 
3.5.1.1.5. FTC and 40 MHz . Signal Determinism 
During normal operation, each channel's FTC is continuously compared to the "majority" 
of all FTCs and adjusted accordingly. If a channel's FTC is in phase with the "majority", 
no action is taken. If the channel's FTC is too slow, its period is shortened by a known 
amount, and if it is too fast, its period is lengthened by the same known amount. The 
"majority" can best be defined by illustrating an example. Figure 3-13 shows four clock 
signals and the resulting "majority" that would be produced by the clock voter algorithm 
used in the AlPS engineering model. Under steady state, no fault conditions, the output of 
the voter algorithm is a very slightly delayed version (equivalent to the gate delay through 
the voter itself) of the second fastest input to the voter. The voter is performing a second 
edge detect algorithm. The "majority" output will only transition from one state to the next, 
following the second input signal to transition to the new state. 
In order to maintain a deterministic relationship between FTC and the 40 MHz 
signal in each channel, whenever an adjustment is made to a channel's version of FTC, 
corresponding adjustments are made to that channel's 40 MHz clock. Clock pulses in the 
40 MHz pulse train are either unaltered or stretched (one 20 MHz clock pulse in place of 
two 40 MHz clock pulses) to guarantee that there will always to be a deterministic number 
of clock pulses in each half period of the FTC. When a channel's FTC is in phase with the 
"majority", one 40 MHz clock pulse is stretched in that channel's 40 MHz clock signal (19 
clock pulses in one period of FTC). Whenever the channel's FTC is late, Case 3 in Figure 
3-14, this pulse stretching is suppressed, shortening the FTC period by one 40 MHz clock 
cycle while retaining 19 clock cycles in one FTC period. If the channel's FTC is too early, 
Case 2 in Figure 3-14, a second stretched clock pulse is included in the FTC period again 
keeping 19 clock pulses in one FTC period. 
3.51.2. Communications Node 
The circuit switched nodes of the IC and I/O networks are identical and are based 
extensively on the nodes developed for the ALPS engineering model. The Communications 
Node (CN) and the operation of each set of Port Logic is portrayed in Figures 3-15 and 3-
16. Each node has five identical ports (only one port is shown in Figure 3-15) where a 
port's input and output are connected via fiber optic cables to a device's output and input 
port, respectively. 
The FDDI data frame, described above, includes fields for the destination and 
source addresses. Each device connected to the IC and I/O networks (ICIS, lOS and CN) 
3-19
CPU and FTC 
Communicator	 Communicator 
CPU and FTC
	
Communicator	 Communicator 
	
CPU and FTC	 FTC	 CPU and FTC 
	
Communicator	 Communicator 
CPU and FTC 
	
Communicator	 Communicator 
Figure 3-12. FTP Fault Tolerant Clock Topology 
I	 I	 I I	 I	 I 
Ii	 I'	 Ii FTC1  
I	 I	 I 
I	 I I	 I	 I	 I	 I 
	
FTC2	 •	 I	 I 
I	 I	 I 
FTC3 _  
I	 I	 I 
I	 I 
FTC4 ii  •	 i	 i	 I	 • 
I	 I	 I 
I	 I 
Majority
I	 I I	 I
Figure 3-13. FTC "Majority" Voter Operation 
3-20
will be provided with a unique 16-bit identification code and, during communications, the 
Destination Address (DA) field will be set to the desired destination's identification code. 
The Source Address (SA) field will always be used to identify the originator of messages 
and thereby specify which device should receive any required response messages. 
3.5.1.2.1. Node Controller 
There is one Node Controller in each CN and it is responsible for CN initialization, 
monitoring the communications traffic for CN messages, and enabling or disabling the Port 
Logic regeneration logic and transmitters as directed by the messages received via one or 
more Port Logic receivers. 
MAJORITY
Case 1: FTC in Phase with MAJORITY 
0 1 2 3 4 5 6 7 8	 9	 10 11 12 13 14 15 16 17 18 19 20 
4OMHzJ]J1J1J1J1J1J1J1J1J[J1J1J]J1J1J1J1J1J1j1JL 
I	 I	 •1 
vrc i 
40MH	 Case 2: FFC Ahead of MAJORITY 
0 1 2 3 4 5 6 7 8 9	 10	 11	 12 13 14 15 16 17 18 19 20 JLnJLftHJ1JLflflflJLft1uLftrLnJ1JLn 
II	 I 
FTC
Case 3: FTC behind MAJORITY 
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
40MHzJ1J1J1,J1J1J1J1J1J11J1J1J1J1J1J1J1J1J1J1J1JL 
I	 I 
FTC__ 
HMajority	 FTC In Phase 
Figure 3-14. Clock Corrections 
3-21
The Port Logic decodes all messages received on any input port and compares each 
DA to the 16-bit CN identification node. If the 16-bit address does not match the code for 
the CN, the message is automatically discarded by the Protocol Decoder. If the DA 
matches the CN's code the Protocol Decoder stores the message in the Receive FIFO and 
sets appropriate status bits to indicate the reception of a message the validity/invalidity of 
the message based on the Checksum. The Node Controller will then decode the message to 
determine if it is a valid message and, if it is a valid message, respond accordingly. If a 
reply message is required, the SA included in the original message will be used as the DA 
in the reply.
PORT LOGIC 
(10F5)  
INPUT 
PORT 
DATA
	 RCVR	 REPEATER
	
OTHERI DATA I
	 TO -
	 I	 PORT 
PORTS
+1 
	
DECODER CLOCKFROM	 REGENERATION 
	
OTHER	 LOGIC 
EXTRACT 
RECEIVE  
FIFO 
•	 I	 I	 I - 
----------------- 
NODE CONTROLLER	 I 
P  
 PORT 
ENABLE 
	
MESSAGE	 ENABLE SEQUENCER	
NODE	 I I	 PORT
 
	
BUFFER	 REGISTER 
Figure 3-15. Communications Node 
Examples of messages to be sent to CNs include requests for status and direction to 
perform CN reconfiguration. The Network Manager, resident in selected processing sites, 
can request status as an input to its network monitoring task and issue reconfiguration 
messages to establish communication paths or to change the port enable status of the CNs. 
3.5.1.2.2. Communications Port Receiver/Transmitter 
Each of the five CN port receiver/transmitter are identical and Figure 3-16 is the 
block diagram of these ports. Every port has a unique Port Enable signal which enables or 
disables the path from the receiver and the path to the transmitter. If the port is disabled 
DATA 
TO 
OTHER 
PORT 
LOGIC 
3-22
and is failure free, whatever data presented to the receiver is not repeated and whatever data 
is presented to the data encoder is ignored. 
If the port is enabled and is failure free, the data presented to the receiver is repeated 
and sent to the regeneration logic in the other four ports. The enabled port's regeneration 
logic will encode a data stream which is the logical-OR of the other four port's receivers 
and the Node Controller transmitter. 
TO NODE	 PORT 
----CONTROLLER	 ENABLE 
4------------J. ----------------------1 
• PORT LOGIC 
(1OF5 
	
PORT	 DATA	 TO 
	
INPUT
	
RCVR	 REPEATER	 ;OTHER 
	
FROM	 .	 I PORT I DATA I 
'	 I 
OTHER 
	
PORTS 
	 ' I	 1AL	
ENCODER	 IEE!>	 - OUTPUT DATA 
I	 I1 I	 I	 U REGENERATION 
FROM	 LOGIC 
NODEU. --------------------------------------------
- CONTROLLER
Figure 3-16. Node Port 
3.5.2. Projected Performance Parameters 
Some performance data was included in the above paragraphs. This section will 
summarize the parameters that affect the performance of the ALPS building blocks when 
they are implemented using the technologies projected to be available in the 1992-1993 time 
frame. The resulting performance projections are summarized in Section 3.5.4. 
3.5.2.1. Fault Tolerant Processor Channel 
The Fault Tolerant Processor channel will be based on RISC microprocessors with 
40 MHz clocks. Additionally, these microprocessors will be supported by instruction 
caches which provide an average of one wait state per instruction fetch and a maximum of 
four wait states for cache misses. Data caches will not be implemented due to the size 
limitations of the SEM-E modules and four wait states will be required for all data accesses. 
The memory available on each SEM-E processor module will be four megabytes of random 
access memory (RAM) and read only memory (ROM), combined. The mix of RAM and 
ROM will be variable in 256 Kbyte increments and will be determined after detailed design 
of the application software.
3-23
Each FTP channel will have two processors. These processors may be allocated 
such that one performs all of the required computation while the other performs all input 
and output or they may each share the computation and I/O burden. As in the engineering 
model implementation each channel can be used in quad-, triplex-, dual-redundant or 
simplex processing sites. The design of the Communicator and Interstage hardware and 
the software supports the use of the same hardware at different levels of redundancy. 
3.5.2.2. Communications Node 
The IC and I/O Networks operate at a 100 MBPS signalling rate. At that signalling 
rate, the information transfer rate is a maximum of 96.4 MBPS if the data frames are 
transferring packets with 4,096 bytes of data. The information transfer rate will be 
correspondingly lower for data frames with fewer bytes in the information field. The 
design of the ALS Communications Node will be based on the engineering model and full-
duplex communication is supported throughout the network. All IC and 1/0 Network 
connections will be made using fiber optic data links for electrical isolation and tolerance to 
electromagnetic interference due to lightning strikes and pyrotechnic devices. 
Each Communications Node will be implemented using a single SEM-E module. 
Each module will provide five ports for connections to the processing sites or I/O devices 
and a Node Controller implemented in a 20,000 gate, 0.8 micron ASIC. 
3.5.2.3. AlPS for ALS Avionics Packaging 
The packaging concept for the AlPS for ALS avionics is the Standard Electronic Module - 
Format E (SEM-E) conduction cooled modules installed in base-plate cooled chassis. 
Module-to-module interconnections are to be made using a motherboard rigidly mounted to 
the chassis and the standard SEM-E 250 contact connector is used on all modules. Figure 
3-17 illustrates the packaging concept for the AlPS for ALS avionics. 
3.5.3. Hardware Failure Rate Projections 
MIL-HDBK-217E provides the failure rate data and method for estimating 
reliability during early design phases [16]. The information needed to apply the method is: 
1) Generic part types including complexity for microelectronics and quantities; 2) Part 
quality levels; 3) Estimate of the number and type of opto electronic components; 4) 
Estimate of the number and type of passive components (resistors and capacitors); 5) 
Estimate of the number of connectors and connector contacts; 6) Estimate of the number of 
other connections (e.g., device solder joints); and 7) Equipment environment. 
3-24
Conduction Cooled 
Module with Circuit Baseplate Mounted and Cooled Chassis with Circular-MIL 
Boards and Component on 
Both Sides. One End and Single r Modules. 
Top Cover Includes EMI/EMC 
Gasket and Environmental 
Gasket. 
Figure 3-17. AlPS for ALS Packaging Concept 
3-25
The expression for generic parts failure rate using this method is as follows: 
XGenenc= NI(XGItQ)1 (3-1) 
where
A<eneric = total generic parts failure rate (failures I 106 hours) 
Ni = quantity of ith generic part 
A43 = generic failure rate for jth generic part (failures / 106 hours) 
ltQ = quality factor for jth generic part 
n = number of different generic parts categories 
In addition to the generic part failure rates estimated above, the failure contributions 
due to the discrete components and connections can also be estimated using expressions 
outlined below. 	 V 
The opto electronic devices required for the IC and I/O networks and the FTP inter-
channel data links have per part failure rates estimated by the following: 
Xp = XbicTnEicQ failures / 106 hours	 (3-2) 
where
Xb = base failure rate (failures / 106 hours) 
ItT = temperature factor 
IVE = environmental factor 
7tQ = quality factor 
Digital integrated circuits and analog components require power supply bypass 
capacitance and ceramic capacitors are typically used for this function. This type of 
capacitor has per part failure rates estimated by the following: 
Xp = ? b1tE1tQ7tCV failures / 106 hours	 (33) 
where
Xb = base failure rate (failures / 10 6 hours) 
nE = environmental factor 
ItQ = quality factor 
ItCV = capacitance factor 
Film resistors are used to set predefined signal levels, specify analog amplifier 
gains and limit currents. They have per part failure rates estimated by the following: 
Xp = XItEItQitR failures / 106 hours	 (34) 
where
Xb = base failure rate (failures / 106 hours) 
ICE = environmental factor
3-26
ltQ = quality factor 
I R = resistance factor 
Connectors (MIL-C-38999 Circular), to be used on the LRUs, have per part failure 
rates estimated by the following: 
Xp = kbnElCPXK failures / 106 hours	 (3..5) 
where
Xb = base failure rate (failures / 106 hours) = 0.02eX 
x = -1592 +,(T+273 
473. 
)536; 
T	 T = operating temperature 
ICE = environmental factor 
tp = number of active contacts in connector 
7tK = mating/unmating factor 
Module connectors, to be used on the SEM-E modules, have per part failure rates 
estimated by the following: 
2Lp = ?.b7tE7tP7tK failures / 106 hours	
.	 (3-6) 
where
Xb = base failure rate (failures / 106
 hours) = 0.21 6eX 
 = -2073.6 + T+273 )4.66 T = x	 423	 operating temperature 
ltE = environmental factor 
icp = number of active contacts in connector 
ltK = mating/unmating factor 
Other connections in the system, such as integrated circuit leads to circuit boards, 
have failure rates estimated by the following: 
p7tE
i=1	 (3-7) 
where
' RE = environmental factor 
Ni
 = number of connections of the ith type 
kbi = base failure rate of the jth type connection 
It-fl = tool type factor for the ith type connection 
nQi = quality factor for the 1th type connection 
The individual failure contributors, estimated using the above expressions, can then 
be summed and the result is the total estimated failure rate. 
3-27
Tables 3-1 through 3-10 list the anticipated parts required for the implementation of 
the AlPS for ALS avionics and their generic failure rates. Included in the list are the 
integrated circuits, resistors used for setting signals to desired states, capacitors used for 
filtering noise from the supply voltages, an estimate of the number of gates required for 
each integrated circuit, and the number of electrical contacts. Tables 1, 3, 5, 7, and 9 are 
for operation while the vehicle is on the launch pad and Tables 2, 4, 6, 8, and 10 are for 
system operation during powered flight, i.e., the boost phase. These data are used in the 
MIL-HDBK-217E defined calculation of the module mean-time-between-failure (MTBF) 
which are also shown in the tables. 
The failure rates of the AIPS/ALS avionics modules are summarized in Tables 3-
11, 3-12, and 3-13 for the three ALS mission phases - launch pad, boost phase, and on-
orbit - respectively. The on-orbit failure rates were obtained by using the space 
environmental factor ltE = 0.9. 
Module: CPJ Environment' Fixed 
Part No.: Part Name: Quantity : Gates: I/O Contacts: Generic Fail Rate Env/Temp Factor: Fail./i 0'6Hrs 
Microprocessor 1 105,00 168 0 280 1.00 0.280 
2 Coprocessor 1 80,000 132 0.2802 1.00 0.280 
3 RAM 4 2,100,000 4 0 0.5416 1.00 2.166 
4 ___ 30 1,100.00 4 0 0.2447 1.00 7.341 
5 CPUASC 1 20,00 200 0.4103 1.00 0.410 
6 Bus Transceivers 10 200 201 0.1627 1.001 1.627 
7 Resistors 80 n/a 21 0.0037 1.001 0.296 
8 Capacitors 40 n'
 0.0170 1:00 0.680 
9 I/O Connector 1 n/a 350 3.17E.1 2.00 6.35E-1 
MTBF: 1	 76445. 
Table 3-1. CPU Module Parts List (Ground Fixed) 
Module: CRJ Environment: Missile Launch 
Part No' Part Name: Quantity ' Gate's. I/O Contacts Generic Fail Rate: Env/Temp Factor: Fail/i 046Hrs: 
I Microprocessor 1 105,000 168  1.2143
 00 1,2143 
2 Coprocessor 1 80,000 132
 1.2143
 .00 1.2143 
3 RAM 4 2,100,000 40  1,2695  00 5.0781 
4
 30 1,100,000 40 0.7060
 .00 21.180 
5 CPUASIC 1 20.000 2001 1.8097
 .00 1.809 
6 Bus Transceivers 10 '200 20 0.1627
 1.001 1.627 
7 Resistors 80 n/a 2 0.0540
 1.00F 4.320 
8 Capacitors 40 n/a 2 0.0100
 .00 0.400 
9 1/O Connector I n/a 350 3.17E-12
 .50 4.76E-1 
MTBF: 271417
Table 3-2. CPU Module Parts List (Missile Launch) 
3-28 
Module: Share Devices
 Environment: Ground Fixed
Part No.: Part Name: Quantity: Gates: I/O Contacts: Generic Fail Rater Env/Temp Factor:
______________ 
Fail./106Hrs 
1 Real-Time Clock 1 1.500 168 0.0750
 .00 0.0750 
2 WatchdopTirner 1 750 13 0.039 0.0391 
3 RAM 4 2.100,000 4 0.5416 .00 2.1665 
4 MaintenanceASlC 1 20,000 200 0.410
 0.4103 
5 Bus Transceivers 10 200 2 0.162  .00 1.6270 
6 Resistors I	 80 n/a 1	 21 0.003
 .00 0.2960 
7 Capacitort 40 We 2 0.017
 .00 0.6800 
8 I/O Connector I n/a 1	 3501 3.17E-1  .00 6.35E-12 
MT 188896.8 
Table 3-3. Shared Devices Module Parts List (Ground Fixed) 
Module: Share Devices Environment Missile Launch 
Part No.: Part Name: Quantity : Gates: I/O Contact" Generic Fail Rats Env/Temp Factor: Fail./1 OA6Hra 
1 Real-Time Clock 1 1,500 168 0.3091
 .00 0.3091 
2 Watchdog Timer 1 750 132 0.1627
 .00 0.1627 
3 RAM 4 2,100,000 40 1.2695  .00 5.0781 
4 MaintenariceASlC 1 20.000 200 1.8097
 .00 1.809 7. 
5 Bus Transceivers 10 200 20 0.2480
 00 2.4800 
6 Resistors 1	 80 n/al 2 0.0540
 1.001 4.3200 
7 Capacitor' 40 n/a 2 0.0100
 .00 0.4000 
8 1/O Connector 1 .	 n/a 350 3.17E-12 .50 4.76E-12 
 MIBF: 686837 
Table 3-4. Shared Devices Module Parts List (Missile Launch) 
Module' COM/INT
 Environment: Ground Fixed 
Part No.: Part Name: Quantity: Gates: I/O Contacts: Generic Fail Rate' Env/Tèmp Factor: Fail/i OA6HrS: 
1 COMASC
 20.000 16 0.4103
 1.00 0.410 
2 INTASIC  20,000 13 0.4103
 1.00 0.410 
3 Oscillator
 50 4 0.0830
 .00 0.166 
4 Bus Transceivers 1 20 2 0.1627
 .00 1.627 
5 Ff0 Transmitter
 8 250 40 0.0698  1.00 0.558 
6 F/OReceivar 81 2,5001 401 1.4704 1 ooT 11.763 
7 Resistors 80 n/a
 0.0037
 1.00 0.296 
8 Capacitors 40 n/a
 0.0170 1.00 0.680 
9 I/O Connector I I	 n/a 350 3.17E-12 2.00 6.35E-1 
 MTBF: 62848.8 
Table 3-5. COM/INT
 Module Parts List (Ground Fixed) 
Module: COM/INT Environment: Missile Launch  
Part No.: Part Name: Quantity' Gates: I/O Contacts: Generic Fail Rate' Env/Temp Factor: Fail./106Hrs 
1 COMASIC 1 20.000 168 1.8097 1.00 1.809 
2 INTASIC 1 20,000 132 1.8097 1.00 1.809 
3 Oscillator 2 50  40 1.1000 1.00 2.2000 
4 Bus Transceivers 10 200  201 0.2480 1.00 2.4800 
5 Ff0 Transmitter 8 250 40 1.1552 1.00 9.2416 
6 Ff0 Receiver 8 2.500 401 18.3091 1.00 146.472 
7 Resistors 80 n/a 2 0.0540 1.00 4,3200 
8 Capacitors 40 n/a 2 0.0100
 1.00 0.4000 
9 I/OConnector I n/a 350 3.17E-12 1.50 4.76E-12 
MTBF: 5926.5
Table 3-6. COM/INT
 Module Parts List (Missile Launch) 
3-29 
Module: NIS - Environment: Ground Fixed 
Part No.: Part Name: Quantity : Gatec I/O Contacts: Generic Fail Rate Env/Temp Factor: Fail./lOA6Hrs: 
1 IEEE-802.5 ASIC  30,000 168 0.4103 1.00 0.410 
2 F/O Transmitter , 250 40 0.0698  .00 0.06 
3 F/O Receiver  2,500 40 1.4704  00 1.470 
4 RAM e 2.100,000 40 0.5416  .00 4.333 
5 Bus Transceivers 10 200 20 0.1627  .00 1.627 
6 Resistors 1	 80 n/a 21 0.00371 1.001 0.296 
7 Capacitors 40 n/a 2 0.0170 .00 0.680 
8 I/O Connector I n/a 350 3.17E-12 2.00 6.35E-1 
MTBF:l 112530.4 
Table 3-7. NIS Module Parts List (Ground Fixed) 
Module: NIS  Environment: Missile Launch 
Part No.: Part Name: Quantity Gates: I/O Contacts: Generic Fail Rater Env/Temp Factor: Fail./1 0'6Hrs: 
1 IEEE-802.5 ASIC 1 30,000 168 1.8097  .00 1.8097 
2 F/O Transmitter 1 250 40 1.1552  .00 1.1552 
3 F/O Receiver 1 2,500 40 18.3091  00 18.3091 
4 RAM 8 2.100,000 40 1.2695  00 10.1561 
5 Bus Transceivers 1 0 200 20 0.2480  00 2.4800 
6 Resistors 80 n/a 2 0.0540 1.001 4.3200 
7 Capacitors 40 n/a 2 0.0100 -	 .00 0.4000 
8 I/O Connector 1 n/a 350 3.17E-12 .50 4.76E-12 
MTBF: 1	 25886.5 
Table 3-8. ICIS/IOS Module Parts List (Missile Launch) 
Module U'J  Environment: Ground Fixed 
Part No.: Part Name: Quantity : Gates. I/O Contacts Generic Fail Rate Env/Temp Factor: Fail./10'6Hrs: 
1 IEEE-802.5 ASIC 1 30.00 168 0.4103  00 0.410 
2 CNASIC 1 20,000 168 0.4103  .00 0.410 
3 RAM 1 2,100,000 40 0.5416 .00 0.541 
4 FIQM 1 1,100,000 40 0.2447  00 0.2447 
5 Oscillator 2 50 40 0.0830  .00 0.166 
6 F/O Transmitter 5 250 40 0.0698  .00 0.349 
7 F/O Receiver 5 2.500 40 1.4704  00 7.35 
8 Resistors 80 n/a 2 0.0037  .00 0.296 
9 Capacitors 40 n/a 2 0.0170 1.00 0.680 
10 I/O Connector 20 n/a 20 3.17E-1 2 2.00 1.27E-1 
MTBF: 95694. 
Table 3-9. CN Module Parts List (Ground Fixed) 
Module: CN Environment: Missile Launch 
Part No.: Part Name: Quantity Gates. I/O Contacts Generic Fail Rate: Env/Temp Factor: Fail./1 061-Irs 
I -	 IEEE-802.5 ASIC 1 30,000 168 1.8097 1.00 1.8097 
2 CNASC 1 20,00 168 1.809 1.00 1.8097 
3 RAM 1 2.100,00 40 1.269  100 1.2695 
4  1 1,100,00 40 0.706  .00
_
0.7060 
5 Oscillator 2 50 40 1.100  .00 2.2000 
6 F/OTransmitter 5 250 40 1.1552  .00 5.7760 
7 F/OReceiver 5 2.500 40 18.3091  .00 91.545 
8 Resistors 80 n/a 2 0.0540  .00 4.3200 
9 Capacitors 40 n/a 2 0.0100 .00
_
0.4000 
10 I/O Connector 20 n/a 20 E . 3.1712 1.50
_
9.52E.11 
MTBF: _9104.4
Table 3-10. CN Module Parts List (Missile Launch) 
3-30 
Channel Module Complement: Qty: Fail. Rate: Aggregate: 
CPU Module 2 1.31 E-05 2.62E-05 
Shared Devices Module 1 5.29E-06 5.29E-06 
COM/INT Module 1 1.59E-05 1.59E-05 
NIS Module 21 8.89E-06 1.78E-05  
Channel Failure Rate: 6.51 E-05 
Channel MTBF: 15,351 .4 
Communications Node: Qty: Fail. Rate: _Aqregate: 
CN Module 1 1.04E-05 1.04E-05 
CN Failure Rate: 1.04E-05 
CNMTBF: 95,694.5 
Table 3-11. Channel and CN Failure Rates (Ground Fixed) 
Channel Module Complement Qty Fail. Rate: Aggregate 
CPU Module 2 _3.68E-05 _7.37E-05 
Shared Devices Module 1 1.46E-05 1.46E-05 
COM/INT Module 1 1.69E-04 1.69E-04 
NIS Module 21 3.86E-05 7.73E-05  
Channel Failure Rate: 3.34E-04 
Channel MTBF 29991.9. 
Communications Node Qty 'Fail. Rate: _Aggregate 
CN Module 1 1.10E-04 _1.1OE-04 
CN Failure Rate: _1.10E-04 
CN MTBF: 9,104.4 
Table 342. Channel and CN Failure Rates (Missile Launch/Boost Phase) 
Channel Module Complement: Qty: Fail. Rate: Aggregate: 
CPU Module 2 2.54E-6 5.1 OE-6 
Shared Devices Module 1 1.01 E-6 1.01 E-6 
COM/INT Module 1 1.17E-5 1.17E-5 
NIS Module 21 2.67E-6 5.35E-6  
Channel Failure Rate: 2.31 E-5 
Channel MTBF: 43,290 
Communications Node: Qty: Fail. Rate: Aggregate: 
CN Module 1 7.6E-6 7.6E-6 
CN Failure Rate: 7.6E-6 
CN MTBF: 131,529
Table 3-13. Channel and CN Failure Rates (On-orbit) 
3-31 
3.5.4. Performance Projections of AlPS for ALS Building Blocks 
This section highlights some of the important performance projections of the AlPS 
building blocks for ALS. A complete projection of all the performance metrics is beyond 
the scope of this study. However, the performance summary given in this section is 
adequate to do a preliminary definition of the ALS architecture. The performance has been 
projected using several sources of information. The details of the expected microprocessor 
performance in the 1992 time frame are provided in the hardware technology survey and 
projections [2]. A complete set of metrics that are considered relevant for the ALS 
applications and the empirical performance data for a subset of these figures of merit are 
described in [1, 8]. The empirical data collected on the current versions of the AlPS 
hardware and software [1, 8] forms the basis which was used with the estimated 
parameters of the AIPS/ALS flight system, described earlier in Section 3.5.2, to project the 
performance of the AIPS/ALS flight system. 
The AIPS/ALS performance is partitioned and summarized along the hardware 
building blocks. Table 3-14 is a summary of the performance projections of the FTP 
hardware building block and the Local System Services software building block. (The 
details of the Local System Services software are provided in [9]). Table 3-15 is a. 
summary of the performance projections of the 1/0 network and lOS hardware building 
blocks and the I/O System Services software building block. (The details of the- I/O 
System Services software are provided in [10] and [11]). Table 3-16 is a summary of the 
performance projections of the Inter-Computer (IC) network and ICIS hardware building 
blocks and the IC Communication Services software building block. (The details of the IC 
Communication Services software and the ICIS and IC network hardware are provided in 
[12].) The table presents the time for a site to site communication between a source 
application task executing on one FTP sending a message to a sink application task 
executing on a different FTP. It is the time from when the source application task calls the 
SEND—OUTPUT routine with a message until the sink application has the message 
available. The total projected time for the AIPS/ALS is 1.278 milliseconds. 
3-32
1. FTP THROUGHPUT 
1.1 Raw CPU Throughput 
15 MIPS (DAIS Mix) Per Processor 
30 MIPS Per FTP 
1.2 Total Overheads 
1.2.1 FTP Redundancy Management 
CP 1% of Throughput 
lOP 1% of Throughput 
1.2.2 Cross-Channel Synchronization 
CP 5% of Throughput 
lOP 5% of Throughput 
1.2.3 IOP-CP Contention 
CP 5% of Throughput 
lOP 5% of Throughput 
1.2.4 1/0 and IC Network Management 
CP 0 
lOP 10% of Throughput 
1.3 Useful FTP Throughput 
CP 89% or 13.35 MIPS 
lOP 79% or 11.85 MIPS 
FTP 25.2 MIPS 
2. ADA RUN TIME SYSTEM OVERHEADS 
2.1 Ada Run Time System on CP or lOP 
2.1.1 Timer Dispatch	 432 instructions (27 ps) 
2.1.2 Local Event	 196 instructions (12 ps) 
2.1.3 Simple Context Switch	 100 instructions (6 ps) 
2.2 IOP-CP Communication 
2.2.1 Global Event 
Source CPU (CP or lOP) 	 132 instructions (8 ps) 
Sink CPU (lOP or CP)	 240 instructions (15 ps) 
Table 3-14. AIPS/ALS FTP and Local System Services Projected 
Performance 
3-33
3. REDUNDANCY MANAGEMENT 
3.1. Permanent FDIR (Fault Detection, Isolation and Reconfiguration) 
3. 1.1	 No Fault Conditions 
CP 736 instructions (46 p.$) 
lOP 680 instructions (42 p.$) 
3.1.2 Data Exchange Network Fault 
CP 2024 instructions (126 p.$) 
lOP 120 instructions (75 ps) 
3.1.3	 Unsynchronized Channel 
CP 1536 instructions (96 ps) 
lOP 430 instructions (270 ps) 
3.2. Transient FDIR 
3.2.1	 No Fault Conditions 
CP 20 instructions (1.25 ps) 
lOP 56 instructions (3.5 ps) 
3.1.2 Data Exchange Network Fault 
CP 464 instructions(64 ps) 
lOP 64 instructions(4 ps) 
3.1.3	 Unsynchronized Channel 
CP 728 instructions(45.5 ps) 
TOP 64 instructions(4 ps) 
3.3 Average Fault Detection, Isolation and Reconfiguration Time 
lOmsec 
3.4 Inter-Channel Data Exchange 
3.4.1 Fault Tolerant Clock Frequency 	 2 MHz 
3.4.2 Data Exchange Bandwidth 4 Mwords/sec or 64 Mbits/sec
Table 3-14. AIPS/ALS FTP and Local System Services Projected
Performance (cont.) 
3-34 
Raw I/O Network Bandwidth = I 00  Mbits/Sec 
1. COMMUNICATION MANAGEMENT OVERHEADS TOP 
1.1	 1/0 Processing 
1.1.1 10 Transaction Chain 7040 instructions (440 .ts) 
1. 1.2 8 Transaction Chain 5600 instructions (350 p.$) 
1. 1.3 2 Transaction Chain 3360 instructions (210 i.ts) 
2. REDUNDANCY MANAGEMENT OVERHEADS 
2.1.	 Network FDIR 
2.1.1 Failed Channel 929 instructions (58 p.$) 
2.1.2 Failed lOS 14800 instructions (925 Ls) 
2.1.3 Failed Link (Leaf Node) 24134 instructions (1,508 p.$) 
2.1.4 Failed Leaf Node 35344 instructions (2,208 p.$) 
2.1.5 Failed Link (Branch) 24664 instructions (1,541 p.$) 
2.1.6 Failed Branch Node 46264 instructions (2,892 ps) 
2.2.	 Network Growth 
2.2.1 Full Diagnostics 473600 instructions (29,600 j.Ls) 
2.2.2 No Diagnostics 81600 instructions (5,100 l.Ls) 
2.2.3 Single Chain 19366 instructions (1,200 Rs) 
Table 3-15. AIPS/ALS I/O Network and I/O System Services Projected

Performance 
Raw IC Network Bandwidth = 100 Mbits/Sec 
Source VFP SEND—OUTPUT 493 instructions (30.8 xs) 
Source FFP Set Event 187 instructions (11.67 p.$) 
Source FTP MSR Task 760 instructions (47.5 .is) 
IC Net Time on Network 67	 .ts 
Sink FTP Ave Time for Polling for msg
	 1000 J.ts 
Sink FTP ICIS RM 1293 instructions (80.8 p.$) 
Sink FTP Context Switch 120 instructions (7.5 ps) 
Sink FTP MSR Task 426 instructions (26.7 ps) 
Sink FTP GET INPUT 102 instructions (6.67 ps) 
Total Task-to-Task Communication Time on IC Network = 1.278 msec
Table 3-16. AIPS/ALS IC Network and IC Communication Services
Projected Performance 
3-35 
4.0 PRELIMINARY ALS AVIONICS ARCHITECTURE 
As discussed in Section 1 on the design for validation methodology, AlPS for ALS 
configuration (s) are defined using as inputs the ALPS architectural rules, guidelines and 
attributes, the projected reliability, performance, physical characteristics and other attributes 
of the building blocks, and the ALS avionics requirements. The ALS avionics 
requirements were described in Section 2 and the performance and reliability parameters of 
the ALPS hardware and software building blocks, projected in the ALS time frame, have 
been summarized in Section 3. 
The process of matching the avionics requirements with the building block 
capabilities is a multidimensional problem. However, it can be simplified by decomposing 
the requirements into two orthogonal sets each of which can be mapped independently of 
the other as a first order approximation and each of which determines a different aspect of 
the architecture. The performance related ALS requirements such as throughput, memory, 
transport lag, input/output latencies, etc. determine the virtual avionics architecture. The 
reliability related ALS requirements such as probability of mission success, launch 
availability, launch pad maintenance, function criticality, etc. determine the physical 
avionics architecture. 
A preliminary definition of the ALPS for ALS avionics architecture has been carried 
out using the architecture synthesis process described in Section 1.2. Sections 4.1 and 4.2 
describe the preliminary virtual and the physical avionics architecture for ALS, 
respectively. Using the projected performance and reliability parameters of the building 
blocks, analytical models of the ALS architecture were solved to predict the avionics 
reliability and the availability for the ALS mission scenario. These results are described in 
Section 4.3. Section 4.4 concludes with some thoughts on future work necessary to 
complete the ALS architecture synthesis. 
4.1	 Virtual Architecture 
The ALS functions that require the highest throughput are IMU and GPS 
processing, Kalman filter, adaptive guidance and control, and propulsion control. Using 
the Martin Marietta supplied processing requirements, all the non-propulsion functions 
require a total of about 8.8 MIPS throughput (including margins) with a peak 
instantaneous throughput of about 3 MIPS to perform part of the Navigation/IMU 
Processing function. The total is the sum of throughput requirements for all tasks. The 
peak is the throughput required to perform an indivisible task. If a processor is not fast 
enough to execute all the tasks then the workload can be assigned to a number of parallel 
processors. However, a processor with at least the peak throughput is required to run the 
highest throughput indivisible task. The most demanding non-propulsion peak throughput 
requirement of 3 MIPS for the ALS results from the Bending Processing task, which is 
4-1 
PeAm	
PRECEDING PAGE BLANK NOT FILMED
part of the Nav/IMU function. The Bending Processing task executes only 300 
instructions per iteration. At the 100 Hz iteration rate of this task, the average margined 
throughput requirement is about 0.03 MIPS. However, in order to meet the processing lag 
constraint of 0.1 msec, the peak throughput requirement rises to 3 MIPS. Since this is 
considerably below the expected processor throughput for the ALS time frame, this 
requirement does not pose a problem. However, if such a fast processor were not 
available, one would need to analyze the Bending Processing task further to see if it can be 
parallelized into two or more parts. 
Since all of the non-propulsion functions taken together require less than the total 
useful throughput projected to be available in an ALS Fault Tolerant Processor, all of the 
non-propulsion functions can be allocated to a single FTP. We will call this the core FTP. 
If the total throughput requirement had exceeded the capacity of a single FTP, some criteria 
such as function criticality, interfunction communication rates, etc. would have had to be 
used to partition the ALS functions into groups and allocate them to different FTPs. The 
core FTP will have a growth margin of 16.4 MIPS (25.2 MIPS useful FTP throughput 
from Table 3-14 minus 8.8 MIPS throughput required for all non-propulsion functions 
from Section 2) or 186 per cent. This is in addition to the throughput margins already built 
into the processing requirements at the task level as described in Section 2. 
Another constraint on allocating functions to a single FTP is the FTP data exchange 
bandwidth. All the sensor inputs must be made congruent using the data exchange 
hardware before they can be used by the applications functions. All the outputs are also 
usually voted before they are sent Out to actuators. The output voting also uses the same 
data exchange hardware. The total I/O bandwidth requirement for all the non-propulsion 
functions is estimated to be 11.2 Mbits/sec. This is well within the projected bandwidth of 
64 Mbits/sec of the FTP data exchange mechanism. 
The ALS avionics functions allocated to the core FTP are listed below. 
CORE FTP Functions 
1 Central Control & Processing 
2 Winds Ahead Determination 
3 Vehicle Power System Management 
4 Steering & Staging Control 
5 Sensor Processing 
7 Command & Telemetry Processing 
8 Range Safety & Destruct 
9 Programmable Payload
4-2
Each ALS engine requires a controller which must be capable of providing a useful 
total throughput of about 4.8 MIPS with a peak of about 6.4 MIPS. Since the controller is 
to be colocated with each engine, a dedicated FTP is necessary to host propulsion control 
functions for each ALS engine. A single FTP per engine would provide a growth margin 
of 20.4 MIPS or 400 per cent. If the colocation of the engine controller with the engine 
were not a high level requirement, a single FFP could be allocated to control a group of 4 
engines since it will have the throughput and the data exchange bandwidth necessary to 
control 4 engines simultaneously. This would reduce the total avionics hardware 
substantially and would still leave a growth margin of 6 MIPS. However, it may 
complicate the logistics associated with assembling and testing each individual engine. The 
propulsion controller I/O bandwidth requirement of less than 0.1 Mbits/sec is quite modest 
and is easily accommodated by the FTP. 
The virtual intercomputer communication architecture for ALS is quite 
straightforward. It will be a virtual bus that interconnects the core FTP to all the propulsion 
control FTPs. The bus bandwidth requirement is quite modest since all the non-propulsion 
functions are colocated in a single processing site. The communication bandwidth between 
the core FTP and all the propulsion FTPs in the ALS launch phase consists of propulsion 
commands going to the engine FTPs and the engine status data flowing back to the core 
FTP. The requirement for this is expected to be about 10 Mbits/sec which is well within 
the projected intercomputer bandwidth of 100 Mbits/sec for the ALS IC network. 
The virtual I/O communication architecture for the ALS would Consist of a number 
of parallel, virtual buses that interconnect ALS sensors and actuators to the FTPs. The 
sensors and actuators in each engine will be connected to the FTP controlling that engine on 
a local bus. The core FTP will interface with the IMU, GPS, and all the other sensors 
required to perform the non-propulsion functions. The number of parallel I/O buses 
required in the core FTP and the propulsion FFPs is determined by the number of sensors 
and the frequency of their access. Since the total I/O bandwidth required for all the core 
FTP functions is 11.2 Mbits/sec, a single virtual I/O bus should suffice. This preliminary 
I/O bus definition may be changed if the detailed performance modeling shows that some 
performance criterion such as the transport lag can not be satisfied with a single bus. The 
I/O bandwidth required for the propulsion FTPs is even less, only 0.1 Mbit/sec and can be 
easily satisfied with a single virtual I/O bus. 
The vehicle health monitoring functions, their processing requirements or the 
number of sensors and the information flow associated with the sensors were not made 
available to CSDL during the course of this study. Therefore no conclusions can be drawn 
at this time about the virtual architecture that is required to read these sensors and perform 
the vehicle health monitoring functions. However, it is possible that the number of these 
4-3
sensors and the frequency with which they are read can easily exceed not only the available 
bandwidth of I/O buses but also the capability of the processors to process the enormous 
amount of collected data in real time. If, on the other hand, the majority of the sensors are 
to be used to collect vehicle health data for launch pad monitoring (which would relax the 
real time processing constraint) and/or for post-flight analysis, then the problem can be 
dealt with quite effectively by providing a separate system which is dedicated to sensor 
collection (but no real time analysis) and telemetry. 
4.2 Physical Architecture 
The architectural parameters that determine the AlPS physical organization for the 
ALS avionics include: redundancy level of FFPs; redundancy levels of sensors, actuators 
and other I/O devices; cross-strapping of I/O devices to channels of FTPs and redundancy 
level of interfaces; trade-offs between FTP redundancy (triplex or quad) versus system 
redundancy (N or N+1 FTPs) for availability; redundancy level of intercomputer and I/O 
networks; and physical topologies of networks. 
Since all ALS functions are flight-critical and cannot be suspended for more than a 
few milliseconds, a fault-masking computational architecture is required. Therefore, all 
FTPs need to be at least triplex at the launch time. Earlier reliability modeling of the FTP 
done for the ALS mission has shown that a triplex FTP has sufficient reliability to meet the 
ALS requirements for short mission durations (of the order of several hours). Also, due to 
the short mission time, no in-flight reconfigurations of the VFPs will be required to meet 
the reliability requirement. That is, after a channel fails in-flight in an FTP, the outputs of 
the failed channel to actuators will be disabled by the Monitor/Interlock circuitry. 
However, no effort would be made to recover the failed channel and retry it to ascertain if 
the channel failure was caused by a transient fault. However, for longer ALS missions, of 
the order of tens of hours, it may become necessary to change this policy. In the present 
AlPS FTP design, the realignment of the state of the faulted channel to the state of the good 
channels is performed in software. Although this process is relatively fast for small 
amounts of volatile memory, it can become unacceptably long for certain combinations of 
large RAM and high iteration tasks that cannot be suspended. A new, hardware 
implemented, channel realignment scheme has been developed to increase the speed of 
realignment and to do it in background without suspending any applications tasks [13,14]. 
This will allow the ALS FTPs to recover from transient faults and continue to provide the 
fault masking capability for longer ALS missions. The projected reliability of the ALS FTP 
as a function of mission time, computed using the module failure rates projected for the 
ALS time frame, is given in the next subsection. 
4-4
The intercomputer bus will be physically implemented as a redundant network. A 
triplex redundancy level is required for dynamic masking of intercomputer communication 
faults. However, it may be possible to relax this requirement somewhat and use only dual 
redundant networks if sufficient progress can be made in the use of authenticated protocols. 
The physical topology of the IC network will be determined by the constraints imposed on 
the physical location of the core FTP and the propulsion control FTPs. 
The I/O buses will also be implemented as redundant networks. Because of the 
inherent redundancy in the sensors and actuators, the I/O networks need not be triplicated. 
Dual redundant networks with a few spare links in each network are expected to meet the 
reliability and availability requirements of ALS. The I/O network topologies will be 
determined by the physical location of sensors and actuators on the launch vehicle. 
The I/O and IC networks will be operated as static, non-reconfigurable buses in-
flight for short missions. For mission lengths of tens of hours, it may become necessary to 
make them reconfigurable. The flight system performance projections, summarized in 
Section 3.5, show that the in-flight reconfigurations of networks can be accomplished 
without suspending any of the ALS applications tasks. The reconfigurability of networks 
is also intended to tolerate various link and node failures during the relatively long 1 to 2 
weeks on the launch pad. This could obviate the launch pad repairs and launch delays 
resulting in a reduced overall cost of the ALS. 
The core FTP would also be provided with a spare channel to obviate launch pad 
repairs. Thus, the core FTP would be quad redundant rather than triplex assuring a high 
probability of having at least a triplex level of redundancy at the scheduled launch time. 
The FTP will be operated as a quad unit with all four channels active and performing the 
same tasks in synchronism. This provides a better fault coverage for the fourth channel in 
comparison to a triplex with a standby spare channel. Since the number of FTPs in the 
candidate ALS architecture is small (only 1 is required to perform all the non-propulsion 
functions), this strategy of providing a spare channel in the FTP is quite cost effective. 
However, if the ALS requirements were to increase or the projected FTP throughput were 
to be significantly lower, more FFPs would be required to perform the ALS functions. A 
more cost effective strategy in this case may be to add a spare triplex FTP and provide the 
sensor and actuator cross-strapping necessary to reassign functions to processing sites. 
Use of global I/O networks enables a straightforward and cost effective sharing of sensors 
and actuators among a number of FTPs. The next subsection provides numerical modeling 
results on the availability of the ALS core avionics. 
The sparing of the propulsion control FTPs may or may not be necessary 
depending upon the philosophy of launching the ALS with fewer than a complete 
4-5
complement of engines. The baseline ALS vehicle design supplied to CSDL has up to 17 
engines. If it is necessary to have all the engines operating prior to launch (which is the 
current launch philosophy), it will be necessary to make the engine control FTPs quad 
redundant as well. However, if a new philosophy of launch with failed components is 
applied to the ALS, and this can be more cost effective and safe if designed-in from the 
outset, then it is no longer necessary to make sure that every engine controller has a fault 
masking capability at the launch time. 
4.3. Physical Characteristics 
Each AIPS/ALS Fault Tolerant Processor channel will consist of six SEM-E 
modules -- two CPU modules, one Shared Devices module, one Communicator and 
Interstage module, and two Network Interface Sequencer modules. The SEM-E modules 
are 5.88 in. x 6.68 in. x 0.6 in. and weigh approximately 1 lb. each. Power supplies will 
also meet the SEM-E form-factor and will occupy two additional module slots in each 
channel. Power dissipation is estimated to be 8 W for each CPU module, 2 W for the 
Shared Devices module, 3 W for the Communicator and Interstage module, and 6 W for 
the NIS module. Power dissipation for each channel, therefore, is approximately 41 W 
including the power dissipation for 80% efficient power supplies. Each channel will be 
packaged in a single chassis and the volume of the chassis is approximately 902 cu. in. (8 
in. x 11.5 in. x 9.8 in) or .52 Cu. ft. The weight of each channel, including the power 
supplies and chassis, will be approximately 27 lb. 
The projected physical characteristics of an ALS triplex and a quadruplex FFP are 
summarized in Table 4-1. 
Triplex FTP Quad FTP 
Power	 124W 165W 
Weight	 81 lb. 108 lb. 
Volume	 1.5 Cu. ft. 2 Cu. ft.
Table 4-1. AIPS/ALS Fault Tolerant Processor Projected Physical 
Characteristics 
It is illustrative to compare the physical characteristics of the AIPS/ALS FTPs to 
computers on-board current launch vehicles. Two examples of current generation of space 
launch vehicles are the space shuttle and the Titan. 
Each of the three Space Shuttle Main Engines (SSME) has a dedicated engine 
control computer built by Honeywell. The current version, called the block II, is a dual 
redundant processor that controls a number of main propellent valves, solenoids and spark 
4-6 
• igniters. The inputs to the controller include a number of pressure and temperature 
sensors. This computer weighs 200 lbs and consumes 490 Watts in the standby mode. 
During flight, the controller power consumption is 600 watts. The controller dimensions 
are 23.5" x 14.5" x 17.0" or about 3.35 cubic feet. 
The Titan series of unmanned launch vehicles use Magic 352 computer produced by 
the Delco division of General Motors. This guidance and control computer is single string, 
i.e., has no redundancy. It weighs 68 ibs, consumes 220 watts of power and occupies 1.5 
cubic feet. 
It can thus be seen that the physical characteristics of the AIPS/ALS Fault Tolerant 
Processors are well within the realistic constraints that might be imposed on the launch 
vehicle avionics. 
4.4	 Reliability and Availability Projections 
The reliability and availability models of the AlPS building blocks are described in 
detail in Section 4 of the accompanying report: Advanced Information Processing System: 
Design and Validation Knowledgebase [1]. These models were executed using the failure 
rates and recovery rates for a 1992-technology ALS F1'P summarized earlier in this report 
in Section 3.5. The availability and the reliability of the AlPS for ALS avionics architecture 
are presented in Tables 4-2 and 4-3. Table 4-2 lists these results for the case where the 
core ALS FTP and all the propulsion FTPs are quadruply redundant. The results for the 
ALS configuration consisting of all triplex FTPs are summarized in Table 4-3. 
Results have been tabulated for the baseline AlPS for ALS avionics architecture as 
well as several variations on the baseline. The baseline architecture, as described earlier, 
consists of a core FT  that performs all the non-propulsion functions and a dedicated 
propulsion control FTP for each engine. The baseline vehicle design from Martin Marietta 
consists of 17 engines. Therefore the baseline architecture consists of 18 FTPs. The 
number of FFPs in the avionics architecture is denoted by the term Critical Minimum 
Complement. Results have been tabulated for a CMC of 1, 2, 3, and 4, in addition to the 
baseline case of 18 FTPs. This allows one to examine the avionics availability and 
reliability only for a subset of functions such as the non-propulsion functions which require 
only 1 FTP to be operational. Furthermore, if a group of engines is allocated to a single 
FTP, as suggested in the earlier discussion on architecture synthesis, or the number of 
engines is fewer than 17, then new RMA numbers can be quickly obtained from these 
tables.
The tables present the per cent Launch Availability of the avionics architecture, for a 
specified complement of FTPs, in column 1. The Launch Availability is defined to be the 
probability of the CMC of FTPs being in a fault masking state at the end of a week of 
4-7
operation on the launch pad. An FTP is capable of masking faults if it has at least three 
operational channels. The next column gives the probability of failure of the ALS avionics 
during the powered flight or the boost phase. A failure of any one FTP out of the CMC is 
considered a total avionics failure. Finally, the last column gives the same failure 
probability for the on-orbit phase of ALS. 
The avionics reliability for the whole ALS mission, i.e., the boost phase plus on-
orbit phase, R(ALS), can be calculated from the failure probabilities of the two mission 
phases, PF(Boost) and PF(Orbit), using the following equation: 
R(ALS) = (1 -PF(Boost)) * (1 -PF(Orbit) }. 
However, for small failure probabilities, as is the case for the ALS mission, the 
mission failure probability is approximately just the sum of the failure probabilities for each 
mission phase. That is, 
PF(ALS) = PF(Boost) + PF(Orbit). 
The RMA models and their underlying assumptions are described in detail in 
Section 4 of [1]. However, it is useful to briefly recapitulate some of the important 
assumptions here. These are as follows. 
• The launch pad operations last 200 hours (approximately 1 week), the boost 
phase lasts 0.2 hours (12 minutes), and the on-orbit phase lasts 50 hours. 
• Failure of any single module in an FTP channel results in the loss of that 
channel. 
• In the boost phase, no recovery is attempted from transient faults, i.e., 
transients are treated as permanent faults. 
• On-orbit, transient failure recovery is performed by the FTP in the background 
without suspending applications tasks. 
• Transient faults occur 10 times more frequently than permanent faults. 
• Permanent faults occur at a rate dependent on the mission phase. The 
permanent failure rates for the AIPS/ALS modules are as summarized in Tables 
3-11, 3-12, and 3-13. 
• The average recovery time from a fault is assumed to be 20 milliseconds. 
• Quad and triplex fault recovery coverages are assumed to be 1.0. Duplex fault 
coverage is assumed to be 0.9.
4-8
Critical Minimum Launch Availability Failure Probability Failure Probability 
Complement (Boost Phase) (On-orbit) 
1 99.88% 8.90* 10-9 5.33*10.7 
2 99.76% 1.78*10-8 1.07*10-6 
3 99.64% 2.67*10-8 1.60* 10-6 
4 
•
99.52% 
•
3.56* 10-8 
•
2.13*106 
. 
• . • . 
18 97.86% 1.60* 10-7 9.59* 10-6
Table 4-2. Availability and Reliability of ALS Avionics (Quad FTPs) 
The following conclusions can be drawn from the analytical evaluation of the AlPS 
for ALS avionics architecture. For the baseline architecture, consisting of 18 FTPs, it is 
necessary to provide a quadruple level of redundancy, in order to meet the 95% launch 
availability requirement. The baseline architecture is expected to have 97.86% availability. 
This configuration will also meet the mission reliability goals. Specifically, the mission 
probability of failure is expected to be 1.6* 10-7 for the boost phase, 9.59* 10-6 for an 
extended on-orbit phase (50 hours), and a total of 9•75* 10-6 for the whole ALS mission. 
This exceeds the goal of 10 5 just slightly. 
Critical Minimum Launch Availability	 Failure Probability	 Failure Probability 
Complement	 (Boost Phase)	 (On-orbit) 
1	 96.20%	 1.67* 10-7	 1.01*10-5 
2	 92.50%	 3.34* 10-7	 2.02* 1075 
3	 89.0%	 5.01*10-7	 303*10-5 
4	 85.6%	 6.68*10-7	 4.04* 10-5 
•	 •	 •	
S 
•	 •	 •	
• 
18	 49.8%	 3.01 10-6	 1.82*10-4 
Table 4-3. Availability and Reliability of ALS Avionics (Triplex FTPs) 
The contributions to the unavailability and unreliability come predominantly from 
the propulsion avionics since 17 out of the 18 FTPs are for engine control. The availability 
of the non-propulsion avionics, which consist of just 1 FTP, is 99.88% for a quad and 
96.2% for a triplex FTP. Similarly, the mission loss probability, attributable to non-
4-9 
propulsion avionics, is 5.42* 10 7 for the quad FTP. Evidently, one needs to reexamine 
the requirement of dedicating a controller to each engine. If, for example, an FFP was 
configured to control 4 engines, which it is capable of doing based on the performance 
projections, only 4 FTPs will be necessary to control 16 engines. This would reduce the 
probability of mission failure due to a failure of propulsion avionics to 2 . 16*1 .6 using 
quad FTPs. The launch availability would improve to 99.52%. 
The ALS vehicle and engine designers should seriously examine the option of 
integrating engine controllers outside the engine and with the core avionics. This would 
not only improve the overall ALS reliability and availability, as demonstrated above, but 
would also result in reduced weight, volume, power and cost. 
4.5 Architecture Summary and Conclusions 
A preliminary AlPS-based fault tolerant computer system architecture has been 
configured to meet the ALS performance, reliability, and availability requirements. A 
single quadruply redundant AlPS Fault Tolerant Processor, the core ViP, will perform all 
the non-propulsion functions required in the ALS. Additionally, there will be a propulsion 
control FTP dedicated to each engine. The core FTP will access the guidance, control, 
navigation and other sensors and actuators on one redundant 1/0 network. Each of the 
engine control FTPs will access engine sensors and actuators on a dedicated I/O network. 
The core FTP and all of the engine control FTPs will be connected by a fault-masking triply 
redundant intercomputer network. 
For short ALS missions, lasting an hour or less, it will not be necessary to 
reconfigure the FTPs or the I/O and IC networks. Redundant hardware would provide 
sufficient fault masking capability to meet the ALS reliability requirement. However, for 
longer ALS missions, lasting 1 hour to 48 hours, it will be necessary to re-integrate FTP 
channels affected by transient faults and to reconfigure the 110 and IC networks. The 
performance projections show that these in-flight reconfigurations can be accomplished 
without suspending any of the ALS applications tasks. The reconfigurability of the 
networks is also intended to obviate expensive launch pad repairs. 
The AlPS for ALS architecture defined here is preliminary in nature but shows that 
the ALS performance and reliability requirements can be met by the AlPS hardware and 
software building blocks that are built using the state-of-the-art technology available in the 
1992-93 time frame. The level of detail in the architecture definition reflects the level of 
detail available in the ALS requirements. As the avionics requirements are refined, the 
architecture can also be refined as well as defined in greater detail with the help of analysis 
and simulation tools. For example, the functions in the core FTP need to be allocated to the 
4-10
computational processor and the 1/0 processor. This requires a more detailed enumeration 
of interfunction communication requirements and 1/0 communication requirements. Also, 
no effort was expended on defining the detailed I/O architecture. This requires as inputs 
the sensor details such as the number and type of sensors, their failure rates, and so on. 
This information can be used to define redundancy levels of sensors and allocate sensors to 
different redundant layers of the I/O network. 
Several variations on the baseline architecture presented here are also possible and 
should be modeled and analyzed. These include allocating several engines to one 
propulsion control FTP, investigating the effects of launch with failures, and using 
authentication for the I/O and IC networks.
4-Il
5.0 IMPACT OF AIPS/ALS ARCHITECTURE ON ALS COST 
5. 1 Introduction 
The main motivation of the National Aeronautics and Space Administration and the 
Department of Defense in sponsoring its Advanced Launch System program is to realize a 
substantial reduction in the recurring launch costs over the present launch systems. The 
avionics architecture selected for the ALS will have an impact on the recurring launch costs. 
However, the impact is not limited to the cost of the avionics system itself or its cost in 
terms of the weight or physical displacement it will add to the vehicle. Being mission 
critical, the reliability of the avionics suite will directly influence vehicle failure. 
Obviously, a failure of the vehicle can incur very sizable costs. A failure while awaiting 
launch on the launch pad may require repair and incur the cost of the repair and the cost 
resulting from interrupting and delaying the launch. A failure during launch can cause a 
loss of the vehicle and of the payload it is carrying. If fault tolerance is also an attribute of 
the avionics suite, there is the possibility of exploiting this trait operationally to reduce 
costs. Therefore, to address the requirement of reducing recurring launch costs, this 
characteristic of the avionics suite needs to be accurately assessed as it is being designed 
and developed. In this way, when design freedom exists, choices can be made which will 
ultimately reduce launch costs. 
The primary objective of this study is to demonstrate a methodology for 
investigating the impact of the avionics suite on the recurring launch cost of the ALS. All 
the factors influencing cost are investigated, however, this study focuses on the 
methodology for quantifying the contribution to the recurring launch costs due to the 
reliability and availability characteristics of the avionics suite. 
Two secondary objectives are pursued. The first is to evaluate the impact of using 
an AlPS Fault Tolerant Processor (FFP) as the avionics computer on the recurring launch 
cost of the ALS. The second objective is to investigate the effect of a number of design and 
operational parameters on the recurring launch cost of the ALS. 
5.2 Problem Definition 
5.2. 1 General Description 
The anticipated mission of the ALS is to launch payloads into orbit about the earth. 
For the purpose of this study, the mission is considered to have two separate phases. The 
first phase of the mission begins when the vehicle is delivered to the launch site, assembled 
and then parked on the launch pad. While on pad the vehicle would undergo testing, be 
loaded with fuel (if necessary), have its status monitored and undergo the other normal 
preflight activities associated with an orbital rocket launch. The vehicle would probably 
remain on the pad for a relatively long length of time. The vehicle would most likely be 
parked a week or more before the scheduled launch date. When the scheduled launch time 
5-I 
vy
	
PRECEDING PAGE BLANK NOT FILMED
did arrive, if the vehicle is judged to be fit for launch, the engines would be ignited and the 
vehicle launched. Launching the vehicle would end the first phase and begin the second. 
The second phase encompasses the time from launch through ascent until the payload is 
delivered to orbit. The time of the launch/ascent phase would be comparatively short - on 
the order of minutes or hours. 
While parked on the pad, the avionics system would be monitored. The Vehicle 
Health Monitoring System (VHMS) is assumed to be a passive subsystem of the ALS 
whose function includes relaying the operational status of the avionics system to the ground 
operations center. The ground operations center has the ability to control the progress of 
the countdown to launch. So, based on the known operational status of the avionics 
system, the ground operations center may proceed with the countdown sequence or 
interrupt the launch before ignition of the engines. 
The bulk of the avionics suite would be carried to orbit along with the payload. 
Since the, avionics suite includes the sensors, actuators and other equipment which would 
be lOcated on the initial stages of the vehicle, parts of the system would be discarded as the 
initial stages are separated and abandoned. However, it is assumed that the bulk of the 
avionics suite - including the avionics computer - would be located in the final stage and 
brought to the delivery orbit of the payload. 
Some other assumptions are made to limit the scope of this study. First, the 
mission is assumed to be unmanned. The missions of the ALS may include manned 
missions. However, associating a monetary cost with the loss of human life brings in 
complications which detract from the objective of this study. (The issue should be 
approached as a safety requirement for the vehicle and not a parameter which can be traded-
off against cost.) Second, the vehicle is assumed to be non-recoverable. Again, the ALS 
may include recoverable subsystems. But, assuming the ALS to be non-recoverable 
simplifies the analyses. Third, the ALS is presumed to have a high probability of mission 
success. 
5.2.2 Contributors to Cost 
The contributors to the cost of the avionics suite are divided into three categories - 
the cost of the avionics system itself, the cost of its weight, and the cost of its unreliability. 
The cost of the avionics system itself is subdivided into fixed and recurring costs. The 
fixed costs are the design and development associated with producing the avionics suite, 
exclusive of the components which will actually perform the mission. This includes the 
development and construction of prototype, validation and verification testing, etc. - any 
costs incurred to produce the working design of the avionics system. The recurring costs 
are the costs of the avionics suite directly related to the specific mission. That is, the actual 
costs to manufacture the system and integrate it with the vehicle given that a validated 
design exists. This includes the costs resulting from any design changes to the hardware or 
software and any testing (acceptance, validation, etc.) which are mission specific. 
5-2
The second category for the cost of the avionics suite is the cost of weight. Since 
the bulk of the avionics system rides with the payload all the way to orbit, it's weight 
subtracts directly from the ALS payload lift capability. Therefore, a cost penalty is 
associated with the weight of the avionics system. 
The third category is the cost of unreliability. The cost of unreliability differs from 
the other two costs in that the actual cost is contingent upon a failure occurring. Therefore 
the cost will depend on a probabilistic event occurring - a failure of one or more 
components within the avionics suite. 
The cost of unreliability is broken up into two subcategories which relate to when 
the cost is incurred - on the launch pad and during launch/ascent. The cost of unreliability 
on the launch pad is derived from actions taken while the vehicle is on the launch pad 
because of detected failures within the avionics suite. While the ALS sits on the pad 
awaiting launch, a failure of a component in the avionics system can occur. if the failure is 
detected then there is the option of taking some action. If the system is fault tolerant and 
sufficient capability for launch still exists then a decision could be made to continue the 
countdown. If sufficient capability for launch no longer exists or it is not desirable to 
launch with the available configuration, then the countdown can be interrupted and the 
failure repaired. Therefore, the cost of unreliability on the launch pad is identified to have 
two contributors, the cost of interrupting the count-down and the actual cost of repairing or 
replacing the failed component. 
Failures of the avionics system occurring while the vehicle sits on the launch pad 
may go undetected. The immediate effect is that no costs are incurred as a result of the first 
phase of the mission. However, these failures may manifest themselves at the time of 
launch and would incur the same costs as catastrophic failures during the launch/ascent 
phase.
The second subcategory of the cost of unreliability is the cost due to avionics 
system failures during launch/ascent. The cost of a failure of the avionics suite during 
ascent is the cost of the entire mission (avionics system, launch vehicle, payloads and 
operational costs associated with the mission), the cost of downtime following the failure, 
the cost of a post launch failure analysis, and less tangible costs such as the cost of launch 
unavailability, the loss of user confidence, and damage to the national image. Note that the 
failure, modes leading to a vehicle failure during flight may include both avionics system 
component failures during the flight and/or during the time on the launch pad. 
5.2.3 Architecture Definitions 
To demonstrate the applicability of the methodology, the architectures to be focused 
on for the avionics suite should be typical of the systems utilized for this application. 
However, if the systems being analyzed are very complex, the task of modeling them will 
detract from the primary objective - illustrating the methodology. Therefore, with the 
5-3
intention of satisfying the primary objective in what is a limited study, the two architectures 
investigated are defined as simplified versions of two potential candidates for the ALS 
avionics. 
The two generic architectures defined for the focus of this study are shown in 
Figures 5-1 and 5-2, respectively. Architecture 1, shown in Figure 5-1, is an architecture 
usually proposed for this type of system. Architecture 2 is presented in Figure 5-2 and is a 
distilled version of an AU'S FTP. This permits a comparative evaluation of the impact to 
the recurring launch cost of using an AU'S FTP as the avionics computer. More 
comprehensive descriptions of the two defined architectures are in the following sections. 
5.2.3.1	 Architecture 1 
Architecture 1 represents an elementary method of incorporating fault tolerance into 
the avionics system. The three channels (Channel-1, Channel-2 and Channel-3) are 
identical computers. Each channel individually processes the information from its 
dedicated sensor and generates an output to control an actuator. (Note that in the actual 
avionics system, each channel would receive data from many sources and generate outputs 
to many other subsystems.) In this generic representation, each of the three sensors 
(Sensor-1, Sensor-2 and Sensor-3) measures the same quantity. Therefore, in the absence 
of failures each channel would ideally produce the same output. The outputs of the three 
channels would then be voted at the actuator and a majority consensus would determine the 
control of the actuator. That is, if the output of one channel was in error (because of a fault 
in a sensor or actuator), two of the three channels would still be providing the correct 
output. The Voting Actuator would ignore the incorrect output and follow the output of the 
two valid channels. 
A number of relevant attributes are assumed to be associated with Architecture 1. 
These are listed below: 
i. The replicated sensors are wired to different processing channels. 
2. No sensor data is being exchanged between the channels. 
3. The channels are asynchronous. 
4. There is no fault masking of channel failures. 
5. No Fault Detection, Isolation and Reconfiguration (FDIR) is being done within 
the avionics system; errors due to sensor or channel failures will not be 
removed from the input to the voting actuator. 
6. The VHMS monitors the health of all components. 
5-4
Sensor-1 I	 -i Channel-1 
Sensor-2 i __________ Voting Channel-2	 Ac uator 
Sensor-3 I	 j Channel-3 
Figure 5-1. Architecture 1 
To Channel-3
Sensor-1 L	 l Channel-1 L	 .'1 Actuator-1 
Sensor-2 F	 Y p.4 Channel-2 I	 ')( u-I Actuator-2 
Sensor-3 k'	 1 Channel-3 k' 	 N Actuator-3 
To Channel-1 
Figure 5-2. Architecture 2 
Architecture 1 contains a single point of failure---the failure of the Voting Actuator. 
Architecture 1 is a generic representation of a fault tolerant architecture typically proposed 
for the avionics suite of this type of system. A single string system is replicated and some 
element must arbitrate between the outputs of the strings. In this case, the Voting Actuator 
acts as the arbitrator and represents a single point of failure for the system. If the Voting 
Actuator is highly reliable and its potential failure is easily predictable (which are both 
plausible properties for the Voting Actuator), then the redundancy within the architecture 
can be expected to significantly contribute to the launch reliability of the avionics suite. 
Note that the difference in the redundancy levels for the three functional groups (Sensors, 
Computers and Actuator) does not preclude a balanced design with regard to the reliability 
allocation for these groups. A balanced design balances the reliability budget allocated to 
5-5 
a-C2
each of the distinguished functional groups, not the means utilized to achieve that 
reliability. 
5.2.3.2 Architecture 2 
Architecture 2 represents a very basic application of the AlPS technology into the 
avionics system of the ALS. Channel-1, Channel-2 and Channel-3 represent the three 
channels of a triplex FTP. Sensor-1, Sensor-2 and Sensor-3 are the same three sensors 
used for Architecture 1. However, their outputs are now hard-wired to each channel of the 
FTP. The replicated actuators (Actuator- 1, Actuator-2 and Actuator-3) individually 
perform the same function as the Voting Actuator in Architecture 1. However, they do no 
voting of channel outputs. Only the output of one channel would be enabled to control 
only one actuator at any one time. 
The relevant attributes of Architecture 2 are listed below: 
1. The replicated sensors are cross-strapped to all processing channels. 
2. In the absence of sensor failures, mid-value selection of sensor data is 
performed within the channels. 
3. There is source congruency between channels. 
4. There is micro-frame synchronization of the channels. 
5. Frequent, exact comparisons of channel outputs and intermediate results are 
performed in order to detect computational failures. 
6. There is fault masking of channel failures. 
7. Failed components are isolated from the control chain. 
8. The health of all components are monitored by the FTP and relayed to the 
VHMS. 
The core FTP of the preliminary ALS avionics architecture is a quadruply redundant 
AlPS FTP. The rational for defining Architecture 2 as a triplex FTP is to make it more 
comparable to Architecture 1. Keeping the redundancy level similar to the more typical 
architecture utilized for this type of application allows a cost comparison to be made 
regarding the implementation of the redundancy and not just a more general comparison 
between two architectures. 
5.3 The Cost Model 
The cost model is implemented as a number of spreadsheets in Microsoft® Excel 
running on an Apple® MacintoshTm computer 1 The use of a spreadsheet program allows 
the easy intermingling of a large number of factors in a tractable manner. This proved to be 
1 Specifically, version 1.5 of Microsoft® Excel is utilized and all of the results shown in this study are 
generated on an Apple® Macintosh Ilci with version 6.1.4 of the Finder file and version 6.0.4 of the 
System file.	 - 
5-6
very advantageous in the development , of the cost model. In total, eleven linked 
spreadsheets comprise the cost model. Appendix A presents the highest level spreadsheet 
for the baseline run. The highest level spreadsheet provides the interface of the cost model 
- it contains the alterable input parameters and provides the more relevant outputs of the 
model.
To facilitate discussion of the cost model, the organization of the spreadsheet of 
Appendix A is followed. Subsections 5.3.1 through 5.3.5 of this report correlate directly 
with Sections I through V of the spreadsheet and are referred to as the cost model is 
presented. 
5.3.1 System Parameters 
The first section of the spreadsheet shown in Appendix A lists the alterable input 
parameters of the cost model. The individual entries are discussed in the sections which 
follow. The general subcategories are discussed here. 
The first four subcategories list the relevant parameters of the avionics suite and 
then the ALS from an increasingly wider prospective. The "Component Attributes" 
subcategory presents the relevant parameters pertaining to the components of the 
architectures presented in Section 5.2.3. The "Avionics System Attributes" subcategory 
includes the cost parameters related to the entirety of the avionics suite for each of the two 
architectures. Subcategory C, "Vehicle Attributes", contains the parameters associated with 
the vehicle which impact the cost model. The "Fleet Attributes" contains only one 
parameter: the number of vehicles in the fleet. 
The fifth subcategory, "Operational Attributes" lists the operational parameters 
which impact the cost model. These are divided between two further subcategories - "On 
the Pad" and "During Launch/Ascent". Theindividual parameters are discussed in Section 
5.3.4. 
5.3.2 Cost of System 
The cost of the avionics system calculated in Section II of the spreadsheet is the cost 
per vehicle to design, develop and construct a fleet of vehicles. It is calculated from 
_Cdd	 (''i_' 
'-system -	 '-component) '-construction 	 (5-1) 
This states that the cost of the avionics system, Qygem, is the sum of three terms. The first 
term is the Design and Development Cost, C1, divided by the Number of Vehicles in the 
Fleet, n. Cdd represents all the fixed costs ofthe system. YCcomponent and Ccocfiofl 
are the recurring COStS.
	 Ccomponent is the sum of the costs of the individual components 
5-7
and Cconstruction is all of the other recurring costs. Cconstruction is taken from the 
Construction Cost (excluding cost of parts) entry of Section I.B of the spreadsheet. 
5.3.3 Cost of Launch Weight 
Since the bulk of the avionics system rides with the payload all the way to orbit, it's 
weight subtracts directly from the ALS payload lift capability. Section HI of the 
spreadsheet associates a cost penalty with this weight. The cost of the avionics system 
launch weight, Cweight, is obtained from 
Cweight = W • Xpayioaci	 (5-2) 
where W is the weight of the avionics system (the sum of the weights of the components 
and the Weight of the Integration Hardware listed in the cost spreadsheet) and Xpayload is 
the Payload Launch Cost. The payload launch cost is the estimated cost for delivering 
payloads to the ALS parking orbit. 
5.3.4 Cost of Unreliability 
Section IV of the spreadsheet computes the costs incurred as a result of the 
unreliability of the avionics system. The costs due to the unreliability differ from the two 
previous categories in that the cost does not materialize until a probabilistic event occurs - 
a failure or sequence of failures which results in either a repair action being taken or a 
mission failure. Therefore, the cost associated with a repair action or of a failed mission 
must be weighted by the probability of the event occurring. This necessitates modeling the 
reliability of the avionics system. 
The cost of the unreliability of the avionics system is conceptually expressed in the 
equation
Cunreijability = P1Cr + PfCf	 (53) 
where
Cunreliabiliry = Cost of unreliability 
Pr = Probability that a repair action is taken 
Cr = Cost associated with repair action 
Pf = Probability that a mission fails 
Cf = Cost associated with a failed mission. 
The first term of Equation (5-3) accounts for the cost of unreliability while on the launch 
pad. The second term is the cost of unreliability during launch. Markov models are 
5-8
utilized to calculate the probabilities of P1 and Pf. The costs Of C1 and Cf are calculated 
directly from the cost entries in Section I of the spreadsheet. 
The models utilized to calculate the reliability of Architectures 1 and 2 are based on 
a single mission scenario. The mission begins when the vehicle is rolled out to the launch 
pad. It then sits on the pad for a specified number of days. If no repair action is taken, the 
engines are ignited and the vehicle launched. The vehicle would then ascend from the 
launch pad to the orbit the payload is to be delivered to. The mission ends successfully 
when the payload is deposited into orbit. 
Three outcomes are possible from this mission scenario. The first is the successful 
completion of the mission. The second is that a decision is made to interrupt the 
countdown to perform a repair. The third is that a mission failure occurs and the vehicle 
and payload are lost. Note that this is slightly different from the actual mission whereby 
the mission continues after a repair action is taken. However, if in the actual mission the 
probability of two or more repair actions taking place is much smaller than the probability 
of only one occurring, then the second and third outcomes of the mission scenario can be 
used as calculations of P1 and Pf, respectively. 
The cost of unreliability is dependent on the repair strategy for detected failures 
within the avionics system. While the vehicle sits on the launch pad, following each 
detected failure of a component, a decision must be made as to whether or not to interrupt 
the countdown and repair the failure (or failures). Even for the simple architectures defined 
in this study many strategies are possible. The most viable ones are defined here. For 
Architecture 1, the analyzed repair strategies are: 
1. Repair when the first failure is detected. All components must be declared 
operational for the launch to occur. 
2. Delay repair until either the failure of the Voting Actuator or both Control 
Chains is detected. (A Channel and its dedicated Sensor are referred to as a 
Control Chain. A failure of either the Channel or its dedicated Sensor results in 
the failure of the Control Chain.) The vehicle is permitted to be launched with 
the detected failure of one Control Chain. 
3. No repairs are performed. The vehicle is launched independent of the 
component failures detected while it resides on the pad. 
For Architecture 2, the analyzed repair strategies are: 
1. Repair when the first failure is detected. 
2. Delay repair until either the failure of 2 Sensors or 1 Channel or 2 Actuators is 
detected. The vehicle is permitted to be launched with only 2 Sensors and/or 2 
5-9
Actuators detected as operational. All 3 Channels must still be detected as 
operational to allow the launch. 
3. Delay repair until either the failure of 2 Sensors or 2 Channels or 2 Actuators is 
detected. 
4. Delay repair until either the failure of 3 Sensors or 2 Channels or 3 Actuators is 
detected. 
5. No repairs are performed. 
The calculation of the two terms of Equation (5-3) are respectively performed in 
Sections IV.A and IV.B of the spreadsheet and discussed in the two sections which follow. 
Section IV.0 of the spreadsheet sums these two terms and reports the total cost due to the 
unreliability of the avionics system. 
5.3.4.1 On the Launch Pad 
Section IV.A of the spreadsheet computes the cost of unreliability on the pad for the 
two architectures analyzed. The first subsection contains the "Intermediate Calculations". 
The "Intermediate Calculations" are the adjustments made to the failure rates and the 
calculation of the cost of interrupting the countdown. The second subsection calculates 
bounds for the cost of unreliability on the pad. 
The base failure rates are adjusted to account for the environment of the launch pad 
and the fact that the avionics system may not be powered up continually while the vehicle 
sits on the pad. The expression
ton 1 
toff (^Xofn 
effective R E= base
 ) 
+ I -
,(5-4) (Itoff 
is used to adjust each respective base failure rate Xb., to provide the effective failure rate 
Aeffective. All of the parameters are taken directly from the Section I.E.1 of the spreadsheet. 
ILE is the Pad Environmental Factor. (ton/toff) and (?ij) are the Ratio of Hours On/Off 
and On/Off Failure Rate Scale Factor, respectively. 
The expression to calculate the cost of interrupting the countdown is taken from 
Reference 3. The Cost of Interrupting the Countdown, C int upt, is calculated from 
(Td +TC)(X- 1)Y 
Cinterrupt
	 (S - 1)	 (5-5) 
5-10
For the ALS, it is assumed that repairs are not made on the pad. If repairs are needed, the 
entire vehicle is brought back to the vehicle assembly area for rework. The Repair Time, 
Td, reflects the time to move the entire vehicle from the pad to the assembly building, where 
the failed unit is replaced, and back to the pad. The total lost time is computed as this 
Repair Time plus the time which had already been invested in the countdown when the 
failure occurred., T. Once a vehicle comes back for repair, it is assumed that surge mode is 
activated until the original launch schedule is reestablished. The Surge Work Rate, S, is a 
scale factor which represents the increase in the nominal work rate during surge mode. The 
Overtime Cost, X, is also a scale factor which represents the increase in the nominal 
Operating Cost, Y, during surge mode. 
The cost associated with a repair action is the Cost of Interrupting the Countdown 
plus the possible cost of replacing the defective components. Therefore, this cost is 
dependent on the failure mode which led to it. So, the first term of Equation (5-3), the cost 
of unreliability while on the launch pad (C 1), is more accurately expressed as 
Cpad =	 PriCri
(5-6) 
where P,,i is the probability of a specific repair action occurring and C1 j is the cost of this 
repair action. Sections 5.3.4.1.1 and 5.3.4.1.2 present the Markov models utilized to 
calculate the probability of occurrence of the repair actions and their associated costs for 
Architectures 1 and 2, respectively. 
5.3.4.1.1	 Architecture 1 
Figure 5-3 presents the Markov model for Architecture 1 for the pad phase of the 
mission. Table 1 describes the states defined by Figure 5-3. In Figure 3, ?, kc and ka are 
the effective failure rates of each Sensor, Channel and Actuator, respectively, a is the 
VHMS Detection and Interruption Rate and ca is the VHMS Self-Test Coverage 
Probability. 
The Markov model shown in Figure 5-3 is utilized to generate the cost of 
unreliability on the pad for all of the defined repairs strategies for Architecture 1. Enough 
resolution is incorporated into the model to also make it compatible with the Markov model 
used for the second phase of the mission. State 1 represents the initial state of the avionics 
system - no failures of its components. States 2 and 3 individually represent the 
occurrence of a first fault in the avionics system. In the case of a fault in one of the Control 
Chains (State 2), two ensuing events can happen. The VHMS could detect the failure 
(State 4) or a near coincidental fault of a second component occurs before this detection can 
take place (State 5). Note that it is conservatively assumed that the VHMS is unable to 
interpret coincidental faults and, at worst, detects neither fault - resulting in an uncovered 
failure of a component. The transitions emanating from State 3 are analogous to those from 
5-11
State 4, with the exception that a detection probability, c 0 , is now associated with the 
detection of the Actuator fault. This is because the Actuator is a simplex component and it 
is assumed the probability of detecting its failure is less than one. 
State 4 represents the failure configuration with one Control Chain providing 
incorrect commands to the Voting Actuator. Architecture 1 continues to operate properly in 
this failure mode. However, the occurrence of another fault will fail the avionics system. 
The reason for distinguishing between States 7, 8, 9 and 10 is to discriminate between 
uncovered and covered failures of the avionics system. 
For the first defined repair strategy for Architecture 1— repair when the first failure 
is detected - States 4, 6, 7, 8, 9 and 10 are the states which represent a repair action has 
been taken. States 1 and 2 are modes from which a successful launch would occur. Being 
in States 3 or 5 would result in a system loss when the vehicle is launched. For this repair 
strategy, the cost of unreliability on the pad is 
Cpad = P6 C 6 + P4,7,8,9,10C4,7 , 8 ,9 ,1 0	 (57) 
where P6 and P4,7,8,910 are the probabilities of being in State 6 and States 4, 7, 8, 9 or 10, 
respectively. C6 and C4,7,8 ,9 1 0 are the respective costs to repair from these states. A 
distinction is made between State 6 and the States 4, 7, 8, 9 and 10 because different failure 
modes triggered the repair actions and greater modeling accuracy can be attained by 
distinguishing between the costs of these repair actions. 
State Description 
1 No Failures 
2 Fault in one Control Chain 
3 Fault in Actuator 
4 Detected fault in one Control Chain 
5 Undetected failure of one or more components 
6 Detected failure of Actuator 
7 Fault in one of two remaining Control Chains 
8 Fault in Actuator following previous detected failure 
9 Detected failure preventing safe launch 
10 Undetected failure of one or more components
Table 5-1. States of Architecture 1 Pad Markov Model 
5-12 
The "First Detected Failure" entry of Section II.A.2.a is calculated directly from 
Equation (5-7). P6 and P4,7,8,9,10 are obtained by solving the Markov model for 
Architecture 1. The bounds result because the defective components may or may not need 
to be replaced. So, C6 and C4,7 , 8 ,9 ,1 0 range between Cinterrupt and Cinterrupt plus the 
maximum number of components which may have to be replaced. The entries for the other 
two repair strategies are calculated analogously. 
5.3.4.1.2 Architecture 2 
The Pad Unreliability Cost for* Architecture 2 is calculated by the same methodology 
used for Architecture 1. However, Architecture 2 is divided up into three independent 
Markov models. One model is constructed for the Sensors, one for the FFP and one for 
the Actuators. The three models are then combined to produce the probabilities of interest. 
Figure 5-4 and Table 5-2 present the Markov model for FTP. Table 5-3 defines the 
symbols in Table 5-2. This model is based on Markov models used for previous reliability 
studies of the VFP. 
Note that a unique structure is recognizable in the FTP Markov model. Starting at 
State 1, the state of the FTP with no failures of its components, a fault can occur in one of 
the three components (Processor, Memory and Interstage) which are distinguished in each 
of its three channels. These faults can be either permanent or transient and are separated 
among States 2 through 7. From these states, one of two events can occur. The first event 
is the FDIR processes could detect, isolate and reconfigure by eliminating the appropriate 
channel if it is a permanent fault (State 8) or resynchronize the affected channel if it is a 
transient fault (return to State 1). The second event is that a second, near coincident fault 
can occur before the FDIR processes are completed and the result is conservatively 
assumed to be an uncovered failure (State 9). 
From State 8, as in State 1, a fault can occur in one of the three components in each 
of the remaining channels. These are divided among States 10, 12, 14, 16, 18 and 20. 
Additional resolution is incorporated at this failure level of the model to separate between 
the detection process and the other processes of FDIR. Since the coverage probability at 
this failure level is most likely to be dominated by the isolation and reconfiguration 
processes, separating the detection processes for the respective faults yields a more accurate 
model. The correct detection of these faults is modeled, respectively, as State 11, 13, 15, 
17, 19 and 21. Note that there is a distinction between a missed detection (State 24) and an 
incorrect isolation or reconfiguration (State 23). State 22 represents the FTP correctly 
operating with a single channel. The occurrence of the next fault fails the FTP. However, 
if the fault is covered (State 25) the launch may be interrupted, whereas, if it is uncovered 
(State 26) the launch may result in a mission failure. 
5-13
Given the structure of the FTP Markov model and the fact that the rates of the FDIR 
processes are much faster than the component failure rates, it is possible to reduce the 
model to the form shown in Figure 5-5. Appendix B discusses the reduction techniques 
utilized to generate the time invariant transition rates of Figure 5-5. The operational states 
of the FTP Markov model are reduced to States 1R (all three channels operational), 2R (only 
two of the three channels operational) and 31 (only one of the three channels operational). 
States 9, 23, 24, 25, and 26 are these same states in Figure 5-4. 
The Markov models for the Sensors and the Actuators are of the same form as the 
FTP Markov model, but are smaller in size. A single, permanent failure rate is associated 
with each Sensor (or Actuator). Their Markov models would appear as Figure 5-4 minus 
States 2 through 6 and 10 through 19 with the parameters for the Sensors (or Actuators) 
replacing those for the Interstages. These models are then reduced to the form shown in 
Figure 5-5 with the techniques of Appendix B. 
The three independent Markov models are combined to generate the entries of 
Section IV.A.2.b of the spreadsheet shown in Appendix A. For each of the defined repair 
strategies, the probabilities of the appropriate states of each of the three models are 
combined to calculate the bounds of the probabilities of the repair actions and weighted by 
the respective costs of these actions as in Equation (5-6). 
Two points should be noted with regard to the input System Parameters 
spreadsheet. None of the repair strategies invoke a repair action when all three Channels 
are detected as failed. So, consequently, there are no Simplex Coverage Probability for 
any of the Channel components. Also, the spreadsheet only allows a single FDIR Rate 
(Recovery Rate) to be input for each type of component failure. The rates for the detection 
and isolation and recovery processes are conservatively assumed to be the FDIR Rate 
(Recovery Rate).
5-14
Figure 5-3. Pad Markov Model for Architecture 1 
5-15
OF	 1 	 2F	 3F 
Figure 5-4. Triplex FTP Markov Model 
5-16
From State -4 To State Transition Rate 
1-2 Apt 
1—*3 App 
1.4 Amt 
1-5 3A1, 
1-46 3A.1 
1  
2-41 
2	 9 2(Apt + App) + 3(Xmt + A.mp) + 3(Aj + 
3-38 Ppp 
3—*9 2(Apt+App)+3(Aint+Xinp)+3(A.it+kip) 
441 Pint 
4---) 9. 3(Apt + App) + 2(A.mt + Amp) + 3(A.j + kip) 
5—*8 Pmp 
5 -4 9 3(Apt + App) + 2(?Lm t + Amp) + 3(Aj + kip) 
6—*1 Pit 
6 -4 9 3(Apt + Apr) + 3(Xmt + Amp) + 2(X + 
7-8 Pip 
7 -99 3(Xpt + App) + 3(Xmt + A.mp) + 2(A. 1 + kip) 
8-+10 2Apt 
8 1 2App 
8—*14 2Xmt 
8-416 2?mp 
8—*18 2?it 
8—+20 2?ip 
10-411 5pt 
10-424 (Apt + Apr) + 2()Lmt +	 + 2(A + 
11—+8 Cpd7pt 
11 ---)23 (1 - Cpd)Ypt + (Apt + Apr) + 2(?m t + Amp) + 2(A + 
12-13 öpp 
12-924 (Apt + Apr) + 2(Amt + 	 + 2(A + A1) 
13-22 Cpcfl'pp
Table 5-2. Transition Rates for FTP Markov Model 
5-17 
From State —* To State Transition Rate 
13 9 23 (1 - CpdYYpp + (Apt + A) + 2(A.m t +	 + 2Oi + 
14-415 8mt 
14--+24 2(Apt + App) + (An 1 + Amp) + 2(A11 + A) 
15-48 CmdYmt 
15-423 0 - Cmd)Imt + 2(Apt + App) + (Amt + Amp) + 2(A 1 + Aip) 
16-417 Bmp 
16 — 24 2(Ap +
	
+ (Amt +	 + 2(A11 + 
17--422 CmdYmp 
17 -4 23 0 - Cmd)Ymp + 2(A 1 + App) + (Am1 + Amp) + 2(A 1 + A1) 
18---) 19 6it 
18 4 24 2(Apt + App) + 2(Amt + Amp) + (A 1 + Aip) 
19-48 CidYit 
19-423 0 - Cid)Yit + 2(A 1 + A) + 2(Amt + Amp) + (A 11 + Aip) 
2021 
20-424 2(A1 + App) + 2(Amt + Amp) + (A11 + Aip) 
21-422 CidYip 
21 -423 0 - Cid)Yip + 2(Apt + A) + 2(Am t + Amp) + (A11 + Aip) 
22 -9 25 c(A1 + A) + Cms(Amt + Amp) + c(A1 + A) 
22 -9 26 (1 - c)(Apt + App) + 0 - Cms)(Amt + Amp) 
+ 0 - c)(A11 + A1)
Table 5-2. Transition Rates for FTP Markov Model (Cont.) 
5-18 
Symbol Description 
Xpt Transient Processor failure rate 
Xpp Permanent Processor failure rate 
Amt Transient Memory failure rate 
mp Permanent Memory failure rate 
kit Transient Interstage failure rate 
kip Permanent Interstage failure rate 
Ppt Transient Processor recovery rate 
Ppp Permanent Processor recovery rate 
Pmt Transient Memory recovery rate 
p, Permanent Memory recovery rate 
Pit Transient Interstage recovery rate 
Pip Permanent Interstage recovery rate 
Transient Processor detection rate 
Permanent Processor detection rate 
mt Transient Memory detection rate 
mp Permanent Memory detection rate 
jt Transient Interstage detection rate 
Permanent Interstage detection rate 
Ypt Transient Processor isolation and reconfiguration rate 
Ypp Permanent Processor isolation and reconfiguration rate 
Ymt Transient Memory isolation and reconfiguration rate 
Ymp Permanent Memory isolation and reconfiguration rate 
Yit Transient Interstage isolation and reconfiguration rate 
Yip Permanent Interstage isolation and reconfiguration rate 
Cpd Duplex coverage probability of Processor failure 
Cmd Duplex coverage probability of Memory failure 
cid Duplex coverage probability of Interstage failure 
cps Simplex coverage probability of Processor failure 
cms Simplex coverage probability of Memory failure 
C is Simplex coverage probability of Interstage failure
Table 5-3. Symbol Definitions for FTP Markov Model 
5-19 
OF	 1 	 2F	 3F 
I	 I	 I 
Figure 5-5. Reduced Triplex FTP Markov Model 
5-20
5.3.4.2	 During Launch and Flight 
Section IV.B of the spreadsheet computes the cost of unreliability from avionics 
system failures during the launch and flight of the vehicle for each of the investigated repair 
strategies. The first subsection contains the "Intermediate Calculations'. The "Intermediate 
Calculations" are the cost associated with a mission failure and the adjustments made to the 
failure rates. The second subsection of the spreadsheet calculates the bounds for the cost of 
unreliability during launch and flight. 
The costs associated with the failure of the avionics system during ascent are 
discussed in Section 5.2.2. For the purposes of this study, this cost (Cf in Equation (5-3)) 
is limited to the cost of the avionics system, the launch vehicle, payloads and operational - 
costs associated with the mission. It is calculated in the spreadsheet directly from 
Cf = Csystem + Wpayload Xpayioaci + Cpayload
	
(5-8) 
where Csystem is the cost of the avionics system itself (calculated in Equation (5-1)); 
Wpayload is the Total Payload Weight; Xpayload is Payload Launch Cost; Cpayload is the 
Payload Value. 
The base failure rates are adjusted to account for the environment of the 
launch/ascent, as was done for the on pad phase of the mission. However, for this phase 
of the mission the avionics system is assumed to be powered up for its entirety. So, the 
relation used to adjust the base failure rates becomes 
-effective = ltE?.base
	 (5-9) 
where ltE is now the Launch Environmental Factor. 
Sections 5.3.4.2.1 and 5.3.4.2.2 discuss the Markov models utilized to calculate 
the probability that a mission failure occurs in order to generate the cost of unreliability 
during launch. 
5.3.4.2.1	 Architecture 1 
Figure 5-6 shows the Markov model for the launch/flight phase of the mission. 
Table 4 describes the states defined by Figure 5-6. During launch it is no longer possible 
to interrupt the countdown to repair failed equipment within the avionics system. 
Therefore, the VHMS no longer has an impact on the unreliability. Hence, the model 
depicted in Figure 5-3 collapses into the first 4 states of the Launch/Flight Markov model. 
State 5 accounts for the probability that a repair action is undertaken during the on pad 
phase of the mission.
5-21
The initial probabilities of the states of the Launch/Flight Markov model are derived 
from the probabilities of the states of the Pad Markov model at the end of the Time on Pad. 
The distribution is dependent on the repair strategy. For example, for the strategy which is 
to repair on the first detected failure the initial probabilities of the Launch/Flight Markov 
model are as listed in Table 5-5. The initial probabilities for the other two repair strategies 
are analogously made. 
Section IV.B.2.a of the spreadsheet calculates the Launch/Flight unreliability cost 
of Architecture 1. The Launch/Flight Markov model is solved using the initial conditions 
for each of the repair strategies. The sum of the probabilities in States 3 and 4 at the end of 
the Flight Time is Pf. Cf has been previously calculated. Therefore, Launch/Flight 
unreliability cost is calculated according to the second term of Equation (5-3). 
OF	 1 	 2F I 
NO 
Figure 5-6. Launch/Flight Markov Model for Architecture 1 
5-22
State
Description 
1 No failures 
2 Fault in one Control Chain 
3 Fault in Actuator, System failure 
4 System failure 
5 Countdown interrupted to perform a repair 
Table 5-4. States of Architecture 1 Launch Markov Model 
State Initial Probability is Probability at End of Time On 
Pad of Listed States of Pad Markov Model 
1 1 
2 2 
3 3 
4 5 
5 4,6,7,8,9,10
Table 5-5. Initial Probabilities of Launch/Flight Markov Model for 
First Detected Failure Repair Strategy of Architecture 1 
5.3.4.2.2 Architecture 2 
The Launch Unreliability Cost for Architecture 2 is calculated according to the same 
methodology used for Architecture 1. The three Markov models used to calculate the Pad 
Unreliability Cost for Architecture 2 are modified slightly to model the cost of unreliability 
during the launch/flight phase of the mission. An additional state, analogous to State 5 of 
5-23 
the Launch/Flight Markov model for Architecture 1, is added to each of the three Markov 
models to account for the probability that a repair action is undertaken during the on pad 
phase of the mission. It is also assumed that during the launch phase of the mission 
channels which are removed by the FDIR processes because of a transient fault are not 
brought back into service. Therefore, in the FTP Launch/Flight model transient faults are 
modeled as permanent faults. 
As is done for Architecture 1, the initial probabilities of the Launch/Flight Models 
are derived from the probabilities of the states of the Pad Markov models at the end of the 
Time on Pad. This redistribution of the probability is dependent on the repair strategy and 
is done analogous to the example presented in Section 5.3.4.2.1. 
Section IV.B.2.b of the spreadsheet calculates the Launch/Flight unreliability cost 
of Architecture 2. The three Launch/Flight Markov models are solved using the initial 
conditions for each of the repair strategies and combined to generated P f. The entries of 
Section IV.B.2.b of the spreadsheet evaluate Pf for the Flight Time and multiply it with the 
previously calculated Cf to produce the Launch Unreliability Cost. 
5.3.5 Total Cost 
Section V of the spreadsheet adds the results of Sections II, IN and IV and presents 
the result for each of the repair strategies for the two architectures. This is the total cost for 
the respective avionics system for the costs which are accounted for in this study. 
5.4 Results 
The cost model described here is used to predict the respective costs of the two 
defined avionics systems - Architecture 1 and Architecture 2. The Baseline System 
Parameters are listed as the appropriate entries of Section I of the spreadsheet shown in 
Appendix A. Sections II through V of the spreadsheet show the intermediate and final 
results using the Baseline System Parameters. 
The Baseline System Parameters of Section I are estimates based on currently 
available technology and the current operational practices for commercial launch vehicles. 
The Component Attributes are chosen under the assumption that Class B parts are utilized. 
The Component Attributes and the Avionics System Attributes inputs for Architecture 2 are 
selected based on past experience in applying the AlPS technology to similar systems. The 
Component Attributes and the Avionics System Attributes inputs for Architecture 1 are 
chosen so that they mimic those of Architecture 2. In this way, the differences in the 
generated results from these two implementations are less obscured by the effect of the 
individual parts. (In actuality, it is reasonable to use equivalent values between 
Architectures 1 and 2 for the Component Attributes and the Avionics System Attributes. 
As justification, refer to References 23 and 24 which report the experience with an 
5-24
asynchronous architecture similar to Architecture 1 for the Flight Control System of the 
AFT/F- 16 aircraft.) 
The Baseline System Parameters of Subsections C, D and E are, again, reasonable 
estimates based on experience. The CSDL Report CSDL-R-2109 [15] and Military 
Handbook MIL-HDBK-217E [16] served as sources for many of the Operational Attributes 
of Subsection E. 
Sections II through V of the spreadsheet in Appendix A present the results 
produced by the cost model for the Baseline System Parameters. 2 Figure 5-7 organizes the 
more relevant data of the spreadsheet into a more enlightening form. The stacked bar chart 
shows the total cost, and breaks down the sources which make up the total cost, for each of 
the investigated repair strategies of the respective architectures. It is obvious that the total 
cost varies significantly between each of the repair strategies for each architecture and 
between the two architectures. These variations are mainly attributed to the cost of 
unreliability. The cost of the avionics system itself and the cost of its launch weight are 
independent of the repair strategy for any particular architecture and, relatively, do not vary 
much between the two architectures. The cost of its launch weight for both architectures 
represents a small contribution to the total cost. 
The two classifications for the cost of unreliability (On the Launch Pad and During 
Launch/Ascent) monotonically vary with regard to the repair strategy, but in different 
directions. In going from the less tolerant repair strategy (First Detected Failure) to the 
most tolerant repair strategy (No Repairs), the cost of unreliability on the launch pad 
decreases, whereas the cost of unreliability during launch/ascent increases. For 
Architecture 1, these two effects are minimized for the repair strategy which delays repair 
until either the failure of the Voting Actuator or both Control Chains is detected. For 
Architecture 2 the optimal repair strategy with regard to cost is to delay repair until either 
the failure of 2 Sensors or 2 Channels or 2 Actuators is detected - this results in a total 
cost of about $362,000. Note that for both architectures, significant reductions in cost can 
be obtained by permitting the vehicle to be launched with detected failures within the 
avionics system. 
Architecture 2 shows the greatest opportunity for minimizing cost. For the Baseline 
System Parameters, Architecture 1 has a greater potential for a failure during launch than 
Architecture 2. Figure 5-7 shows that, for Architecture 1, even when the policy is not to 
allow launch with any detected failures, the total cost is dominated by mission failures 
during launch. The only component of its total cost which can be reduced by allowing 
launch with detected failures (Cost of Unreliability - On the Pad) is relatively small. 
2 Note that in the Adjusted Reliability Parameters of the Fl?
-
Launch
-
Model worksheet shown in Section 
IV.B.l of the spreadsheet presented in Appendix A the transient failure rates of the FTP Channel 
Components are indicated as 0. During the launch phase of the mission, transient failures of a Channel 
have the same impact as permanent failures since a Channel will not be resynchronized once it is isolated 
off-line. The transient failure rate is accounted for in the respective permanent failure rate of the worksheet. 
5-25
Alternately, the cost of unreliability for Architecture 2 for the case which permits launch 
only when there are no detected failures within the avionics system is dominated by the 
unreliability on the pad. Therefore, potential exists for significantly reducing the total cost 
by allowing launch to take place with some detected failures of its components. As pointed 
out, the optimal repair strategy is to delay repair until either the failure of 2 Sensors or 2 
Channels or 2 Actuators is detected. Note that the total cost of this repair strategy is 
substantially less than that of any of those defined for Architecture 1. 
5
2 Cost of Unreliability - Launch 
[II Cost of Unreliability - On the Pad 
Cost of Launch Weight 
4	 0 Cost of Avionics System 
1 
[I] (I)
..	 ri d i p L)	 L	 -. 4-e	
-O	 O	 -O .O 
c.)0 .	 h
	
U	 U	 U. 
— 
	
0	 0	 0 
	
V	 V	 4) C,, 
Figure 5-7. Baseline Results 
5-26
Using the component quality factors listed in Table 5-6, Figure 5-8 is generated 
which shows the sensitivity of total cost to component quality. Table 5-6 is constructed 
from information in CSDL Report CSDL-R-2109 [15] and Military Handbook MIL-
HDBK-217E [ 1 6] . Table 5-6 specifies the factor by which the failure rates of the 
components are multiplied by (Quality Factor) and their respective costs are multiplied by 
(Cost Factor) when a different quality level is assumed from the Baseline System 
Parameters. With the exception of the No Repairs repair strategy, utilizing B quality level 
parts shows itself to be the most economical. If the repair policy is not to repair any 
detected failures within the avionics system while the vehicle sits on the launch pad, using 
S quality level parts minimizes the total cost. 
Quality Level Quality Factor Cost Factor 
S 0.25 10.0 
B 1.0 1.0 
D 10.0 0.50
Table 5-6. Component Quality Factors 
Figure 5-9 shows the sensitivity of the total cost for each of the repair strategies of 
each of the two architectures to the Payload Value. For each architecture, the sensitivity to 
Payload Value appears to be dependent on how lax the launch requirement is. The repair 
strategies which permit launch with more detected failures tend to be more sensitive to the 
Payload Value. However, the repair strategies of Architecture 2 (the architecture based on 
the AlPS FTP), excepting the No Repairs repair policy, are relatively insensitive to the 
Payload Value over the region shown. 
Figures 5-10 and 5-11 present the sensitivity of the total cost for each of the repair 
strategies of each of the two architectures to the Time On Pad and Flight Time, 
respectively. As shown in Figure 5-10, the optimum repair strategy for each architecture 
appears to be least sensitive to the Time On Pad (Actuator or 2 Chains for Architecture 1; 2 
Sensors or 2 Channels or 2 Actuators for Architecture 2). However, the first four repair 
strategies of Architecture 2 all remain less than the optimal strategy for Architecture 1 even 
when the Time On Pad is increased from the Baseline value of 7 days up to 30 days. As 
shown in Figure 5-11, increasing the Flight Time from the Baseline value of 0.167 hours 
amplifies the difference between the two architectures. All of the repair strategies of 
Architecture 2 are less sensitive to this parameter than any of the strategies of Architecture 
1. The least tolerant repair strategies of Architecture 2, First Detected Failure and 2 
Sensors or 1 Channel or 2 Actuators, are least sensitive to the Flight Time. 
5-27 
5 
4 
1 
0 
I
3 
2
SBD SBD SBD SBD SBD SBD SBD SBD 
H
.-
L
.— riv C,4 c r' 
cq
0. 
C4 
L) I-. 
0 -0 .0 •o 
— o
-
U U 
—
U
eq 
V C4 
N en
Figure 5-8. Sensitivity to Component Quality 
5-28 
Architec we 1 
No Rep 
Arc] ecWe2 2  
Firs Detected Failure No Repairs 
Axchitecnire2
Architecu 1 
2 Sens or 1 (
.... 
han or 2 Acts	
Dete ted Failwi 
Architecture 2 Architecture 1 
2 Sens or 2 Chan c r 2 Acts	 Actuator or 2 Ch iins 
Lj	 3 Sens or 2 han or 3 Acts
8 
7, 
6. 1
5. 
0
4. 
!3.
0.5	 1.0	 1.5	 2.0 
Payload Value (Billions of Dollars) 
2 
1
Figure 5-9. Sensitivity to Payload Value 
Figure 5-12 displays the sensitivity of the total cost for each of the repair strategies 
of each of the two architectures to the Repair Time. For the region shown, little sensitivity 
to this parameter is observed. For each of the respective architectures, the strategies which 
allow the vehicle to be launched with more detected failures tend to be the least sensitive to 
Repair Time. 
5.5. Conclusions and Suggestions for Future Work 
A useful methodology has been demonstrated for investigating the impact of the 
avionics suite to the recurring cost of the ALS. The methodology evaluates the cost of the 
unreliability of the avionics suite and includes this along with a more traditional assessment 
of its cost. This allows a truer appraisal of its recurring launch cost to be made. During 
5-29
design and development of the avionics system, this methodology would allow the 
designer to quantitatively predict the impact on cost that design decisions will have. In this 
way, when design freedom exists, choices can be made which will ultimately reduce launch 
costs. The methodology also measures the impact on cost that operational decisions can 
have - such as allowing the vehicle to launch with detected failures. 
When the avionics system is viewed from the perspective of being mission critical, 
the cost of its unreliability can represent a significant portion of its total cost. Therefore, 
the parameters which affect its unreliability also affect its recurring launch cost. For the 
two architectures analyzed in this study, the impacts of two parameters are of particular 
interest. The first is the repair strategy which is employed for failures detected while the 
vehicle sits on the launch pad. Pre-empting launch on the occurrence of the first detected 
failure within the avionics system is not economical. Allowing the vehicle to launch with 
selected detected failures can potentially reduce the recurring launch costs. The second 
parameter of interest is the quality level of the components used. The use of Class B parts, 
as compared with Class S and D parts, minimizes the recurring launch cost for most of the 
repair strategies analyzed. 
Based on the analysis performed, the AlPS FTP architecture (Architecture 2) shows 
itself to be potentially more economical than the more typical architecture employed for this 
application (Architecture 1). The more viable repair strategies of the AlPS FTP architecture 
show themselves to be much more cost effective than any of the strategies of Architecture 1 
for the Baseline System Parameters. These repair strategies of the AlPS FTP architecture 
also appear to be more insensitive to a number of the major system parameters. For the 
AlPS FTP architecture, allowing the ALS to launch with detected failures in 1 Sensor 
and/or 1 Channel and/or 1 Actuator (delaying repair until failures are detected in at least 2 
Sensors or 2 Channels or 2 Actuators) appears to be the most economical repair strategy. 
Though the methodology presented here shows itself to be very useful in 
quantifying the impact of the reliability of the avionics system on the recurring launch cost 
of the ALS, the actual analysis performed is of limited benefit. The models of the two 
architectures which are the focus of the study are quite simplified in nature and lacked the 
complexity of an actual avionics system. Therefore, in the future it would be most useful 
to develop more detailed models that reflect the realistic architecture complexities and apply 
the techniques illustrated here to these models. More specific conclusions regarding the 
architecture and the most optimum repair policies could be made. 
With regard to the methodology itself, some areas warrant further work. The costs 
associated with interrupting the countdown and a failed mission could be modeled better. 
The less tangible costs such as a delay in a launch schedule because of the need to do a 
repair or because a vehicle is lost in flight needs to be investigated further. The analysis 
5-30
Architecture 1 
No Repairs  
Architecture 1	 1 
First Detected Failure	 N 
Architecture 1	 Arc 
Actuator or 2 Chain	 Fir
rchitecture 2 
o Repairs 
hitecture 2 
t Detected Failure 
Architecture 
2 Sens orl Chanor2Acts 
JArchitecture 2 
2 Sens or 2 Chan or 2 Acts 
/ Architecture 2 
3 Sens or 2 Chan or A I Acts 
-----------------
_._
IL
10
	
20 
counterparts. 
10. 
9. 
8. 
7. 
0
6 
5. 
4. 
E2	 3. 
2 
l
could also be expanded to include other critical subsystems of the ALS along with the 
avionics system. The methodology might also be enhanced to include reusable subsystems 
within the ALS and possibly compare the costs of these architectures with their expendable 
Time On Pad (Days) 
Figure 5-10. Sensitivity to Time On Pad 
5-31 
Architecture 1 
No Repairs 
Architecture 1 
First Detected Failure
Architecture I 
Actuator or 2 Chains 
ithitecture 2  
No Repairs 
/ Architecture 2 I 2 Sens or 1 Chan o 2 Acts 7 Sens or 2 Chan or 2 Acts 
Architecture 
3 Sens or2( ianor3Acts
Architecture 2 
- ' First Detected Failure
10. 
9. 
8. 
7. 
6. 
5 
4 
3 
2 
[I]
iJ
	 2
	
3
Flight Time (Hours) 
Figure 5-11. Sensitivity to Flight Time 
5-32 
Architecture 1 
No Repairs 
Our 
No
ifitectaie 2 
Repairs  
Architec ire I 
Ft Dei cted Failure 
Architecti e i Architecture 
Actuator r2Chains
- 4;7De
Failure 
3 
Ai.hitecture 2 
3ensor2Chan or
34chaLor
4 Architecture k2Se1ol Man or2Acts 
 2 
S2Acts -
5 
41 
C
1. 
I]
3 0 
1,J 
0 
I 2. 
0	 2	 4	 6
	 8	 10
Repair Time (Days) 
Figure 5-12. Sensitivity to Repair Time 
5-33
5-34
6.0 SUMMARY AND CONCLUSIONS 
The validated fault tolerant building blocks of the Advanced Information Processing 
System (AlPS) have been configured to meet the reliability, availability, performance and 
other avionics requirements of the Advanced Launch System (ALS) being developed jointly 
by the National Aeronautics and Space Administration and the Department of Defense to 
launch heavy payloads into low earth orbit. This report has described the AlPS for ALS 
architecture synthesis process starting with the ALS mission requirements and ending with 
an analysis of the candidate ALS avionics architecture. The ALS architecture synthesis 
process followed a new design for validation methodology that has been developed as part 
of the AlPS program to assure that fault tolerant computer system architectures for 
advanced applications meet the reliability, performance and other goals of the application. 
The preliminary ALS avionics requirements were obtained by CSDL from the prime 
contractors via Martin Marietta. The detailed computational requirements were developed 
by Martin Marietta and jointly refined by MM and CSDL. The Reliability, Maintainability, 
and Availability (RMA) requirements for the ALS were defined for the launch pad 
operations, the launch or the boost phase, and for on-orbit operation. The ALS avionics 
availability requirement was defined to be 95%, i.e., the ALS avionics must have a fault 
masking capability with a probability of 0.95, at the end of one week on the launch pad. 
The avionics reliability requirement was defined in terms of the maximum allowable 
probability of mission failure or vehicle loss due to avionics failure. This failure 
probability should not exceed 10 5 for the mission duration which may vary from 10 
minutes for short missions to as long as 48 hours for the longest ALS mission. 
The computational requirements consisted of nine top-level functions: Central 
Control and Processing, Winds Ahead Determination, Vehicle Power System Management, 
Steering and Staging Control, Propulsion Control, Command and Telemetry Processing, 
Range Safety and Destruct, and Programmable Payload Interface. Aggregate throughput 
estimates were given for many of these functions. Each function was divided into 
subfunctions which was supposed to be further defined at the atomically schedulable task 
level. However, this process was carried out only for the Central Control and Processing 
function during the course of this study. 
Based on the data provided by MM, the overall ALS throughput requirements were 
estimated to be approximately 8.8 MIPS for non-propulsion functions and 4.8 MIPS per 
engine for propulsion functions. The inter-function communication bandwidth 
requirements were estimated to be 26 Mbits/sec (prelaunch) between the non-propulsion 
and the propulsion functions (17 engines), the non-propulsion I/O bandwidth requirement 
was estimated to be 11.2 Mbits/sec, and the propulsion I/O bandwidth requirement was 
estimated to be 1.587 Mbits/sec (17 engines). 
6-1
PRECEDING PAGE BLANK NOT FILMED
The flight system characteristics of the AU'S hardware and software building 
blocks were defined based on a survey and projection of technology expected to be 
available in the ALS time frame. The characteristics included the physical characteristics, 
performance projections, hardware implementation, module failure rates, and a packaging 
concept.
A quadruply redundant AlPS for ALS Fault Tolerant Processor will consist of eight 
SEM-E modules -- two CPU modules, one Shared Devices module, one Communicator 
and Interstage module, and Network Interface Sequencer modules, and two power 
conversion modules. The SEM-E modules are 5.88 in. x 6.68 in. x 0.6 in. and weigh 
approximately 1 lb. each. A quad FTP, including all enclosures, will weighabout 108 ibs, 
occupy 2 cubic feet and consume about 165 Watts of power. A triplex FTP will weigh 
about 81 lbs, Occupy about 1.5 cubic feet, and consume about 124 Watts. 
The raw throughput of the AU'S for ALS FTP, consisting of two 40 MHz RISC 
microprocessors per channel, is projected to be 30 MIPS using the DAIS mix benchmark. 
The useful FTP throughput available to the ALS applications tasks, after accounting for all 
the fault tolerance and core hardware redundancy management overheads (but not the 
sensor RM), is projected to be about 25 MIPS. The I/O and intercomputer network 
bandwidths will be 100 Mbits/sec. The internal data exchange bandwidth of the FTP will 
be about 64 Mbits/sec. The end-to-end communication time over the intercomputer 
network between functions located on different FTPs is expected to be less than 2 msecs 
per message. The average recovery time from a fault in the FTP is projected to be 10 
msecs.
The raw hardware failure rates per hour, using Class B components, are projected 
to be: 6.5x10 5 per FTP channel and 1.0x10 5 per node on the launch pad; 3.3x10 4 per 
FTP channel and 1.1x10 4 per node during the boost phase; 2 . 3*10-5 per FTP channel and 
7.6* 10-6 per node in orbit. These are the permanent failure rates. Transients are assumed 
to occur at 10 times the permanent failure rates. 
Using the building block performance and reliability projections, a preliminary 
AU'S-based fault tolerant computer system architecture was configured to meet the ALS 
avionics requirements. A single quadruply redundant AU'S Fault Tolerant Processor, the 
core FTP, will perform all the non-propulsion functions required in the ALS. Additionally, 
there will be a propulsion control FTP dedicated to each engine. The core FTP will access 
the guidance, control, navigation and other sensors and actuators on one redundant I/O 
network. Each of the engine control FTPs will access engine sensors and actuators on a 
dedicated I/O network. The core FTP and all of the engine control FFPs will be connected 
by a fault-masking triply redundant intercomputer network. 
For short ALS missions, lasting an hour or less, it will not be necessary to 
reconfigure the FTPs or the I/O and IC networks. Redundant hardware would provide 
6-2
sufficient fault masking capability to meet the ALS reliability requirement. However, for 
longer ALS missions, lasting 1 hour to 48 hours, it will be necessary to re-integrate FTP 
channels affected by transient fault and to reconfigure the 1/0 and IC networks. The 
performance projections show that these in-flight reconfigurations can be accomplished 
without suspending any of the ALS applications tasks. The reconfigurability of the 
networks is also intended to obviate expensive launch pad repairs. 
Extensive analytical modeling of the AlPS for ALS architecture was carried out to 
predict its reliability and availability. For the baseline architecture, consisting of one core 
non-propulsion FFP and 17 propulsion control FTPs, all quadruply redundant, the launch 
availability is expected to be 97.9%. This configuration will also meet the mission 
reliability goals. Specifically, the mission probability of failure is expected to be 9•75* 106 
This exceeds the goal of 10 5 just slightly. 
The contributions to the unavailability and unreliability come predominantly from 
the propulsion avionics since 17 out of the 18 FTPs are for engine control. The availability 
of the non-propulsion avionics, which consist of just 1 FTP, is 99.88%. Similarly, the 
mission loss probability, attributable to non-propulsion avionics, is 5.42*10-7. Evidently, 
one needs to reexamine the requirement of dedicating a controller to each engine. If, for 
example, an FTP was configured to control 4 engines, which it is capable of doing based 
on the performance projections, only 4 FTPs will be necessary to control 16 engines. This 
would reduce the probability of mission failure due to a failure of propulsion avionics to 
2.16* 10-6. The launch availability would improve to 99.52%. 
The ALS vehicle and engine designers should seriously examine the option of 
integrating engine controllers outside the engine and with the core avionics. This would 
not only improve the overall ALS reliability and availability, as demonstrated above, but 
would also result in reduced weight, volume, power and cost. 
The AlPS for ALS architecture defined here is preliminary in nature but shows that 
the ALS performance and reliability requirements can be met by the AlPS hardware and 
software building blocks that are built using the state-of-the-art technology available in the 
1992-93 time frame. The level of detail in the architecture definition reflects the level of 
detail available in the ALS requirements. As the avionics requirements are refined, the 
architecture can also be refined as well as defined in greater detail with the help of analysis 
and simulation tools. For example, the functions in the core FTP need to be allocated to the 
computational processor and the 1/0 processor. This requires a more detailed enumeration 
of interfunction communication requirements and 1/0 communication requirements. Also, 
no effort was expended on defining the detailed 1/0 architecture. This requires as inputs 
the sensor details such as the number and type of sensors, their failure rates, and so on. 
This information can be used to define redundancy levels of sensors and allocate sensors to 
different redundant layers of the 1/0 network. 
6-3
The data presented by Martin Marietta served as an important starting point for 
determining the requirements of the ALS avionics. During the course of the CSDL-Martin 
Marietta interaction an active dialogue was set up which would have in time resulted in a 
more complete definition of a common requirements vocabulary and facilitated the 
acquisition of comprehensive requirements data. 
Several variations on the baseline architecture presented here are also possible and 
should be modeled and analyzed. These include allocating several engines to one 
propulsion control FTP, investigating the effects of launch with failures, and using 
authentication for the I/O and IC networks. 
To complete the design of a validated ALS avionics architecture, the architecture 
synthesis process begun in this study needs to be followed up by the steps outlined in the 
design for validation methodology. Once an architectural configuration has been selected 
that satisfies the reliability, performance and other ALS avionics requirements, the next step 
is the detailed hardware and software design. The detailed design phase will utilize the 
AlPS building block design knowledgebase. Once the building blocks have been 
fabricated using the state-of-the-art microelectronics and the system services implemented 
using the latest Ada Run Time System and compiler, the validation of the integrated 
avionics system can commence. This validation pertains to the specific hardware and 
software implementation and not the architecture or the building block characteristics since 
these have been prevalidated. A test and evaluation of the avionics integrated with 
applications and actual or simulated 1/0 should confirm the predicted performability 
characteristics. Any discrepancies between the predicted and the actual performability 
should be minor and traceable to the detailed implementation phase rather than the 
architectural or building block design. These can be corrected by refining the 
implementation. 
Since a prime design driver for ALS is the cost, a study was also undertaken to 
analyze the impact of the avionics architecture on the launch cost of ALS. A methodology 
was developed to quantify the contribution to the recurring launch costs due to the 
reliability and availability characteristics of the avionics. The resulting cost model was then 
used to predict and compare the costs of two different architectures for the ALS avionics. 
The two architectures modeled were defined as simplified versions of two potential 
candidates for the ALS avionics. Architecture 1 represents an elementary method of 
incorporating fault tolerance into the avionics system. Architecture 2 is a simplified version 
of the AlPS for ALS architecture defined in the current study. 
A useful methodology has been demonstrated for investigating the impact of the 
avionics suite to the recurring cost of the ALS. The methodology evaluates the cost of the 
unreliability of the avionics suite and includes this along with a more traditional assessment 
of its cost. This allows a truer appraisal of its recurring launch cost to be made. During 
design and development of the avionics system, this methodology would allow the 
designer to quantitatively predict the impact on cost that design decisions will have. The 
6-4
methodology also measures the impact on cost that operational decisions can have - such 
as allowing the vehicle to launch with detected failures. 
When the avionics system is viewed from the perspective of being mission critical, 
the cost of its unreliability can represent a significant portion of its total cost. Therefore, 
the parameters which affect its unreliability also affect its recurring launch cost. For the 
two architectures analyzed in this study, the impacts of two parameters are of particular 
interest. The first is the repair strategy which is employed for failures detected while the 
vehicle Sits on the launch pad. Pre-empting launch on the occurrence of the first detected 
failure within the avionics system is not economical. Allowing the vehicle to launch with 
selected detected failures can potentially reduce the recurring launch costs. The second 
parameter of interest is the quality level of the components used. The use of Class B parts, 
combined with a fault tolerant architecture, as compared with Class S and D parts, 
minimizes the recurring launch cost for most of the repair strategies analyzed. It should be 
noted here that the current philosophy for launch system avionics is to use the highest 
quality, that is, Class S, components in a single string, non-fault tolerant architecture. 
Based on the analysis performed, the ALPS FTP architecture (Architecture 2) shows 
itself to be potentially more economical than the more typical architecture employed.for this 
application (Architecture 1). The more viable repair strategies of the AlPS FTP architecture 
show themselves to be much more cost effective than any of the strategies of Architecture 1 
for the Baseline System Parameters. These repair strategies of the ALPS FTP architecture 
also appear to be more insensitive to a-number of the major system parameters. For the 
ALPS FTP architecture, allowing the ALS to launch with detected failures in 1 Sensor 
and/or 1 Channel and/or I Actuator (delaying repair until failures are detected in at least 2 
Sensors or 2 Channels or 2 Actuators) appears to be the most economical repair strategy. 
Though the methodology presented here shows itself to be very useful in 
quantifying the impact of the reliability of the avionics system on the recurring launch cost 
of the ALS, the actual analysis performed is of limited benefit. The models of the two 
architectures which are the focus of the study-were quite simplified and lacked the 
complexity of an actual avionics system. Higher fidelity models that reflect the details of 
the architecture need to be developed. 	 - 
With regard to the methodology itself, the costs associated with interrupting the 
countdown and a failed mission could be modeled better. The less tangible costs such as a 
delay in a launch schedule because of the need to do a repair or because a vehicle is lost in 
flight needs to be investigated further. The analysis could also be expanded to include 
other critical subsystems of the ALS along with the avionics system. The methodology 
might also be enhanced to include reusable subsystems within the ALS and possibly 
compare the costs of these architectures with their expendable counterparts. 
6-5
7.0 REFERENCES 
1. R. Harper, L. Alger, and J. Lala, "Advanced Information Processing System: Design 
and Validation Knowledgebase", NASA Contractor Report 187544, September, 1991. 
2. R. Cole, "Advanced Information Processing System for Advanced Launch System: 
Hardware Technology Survey and Projections", NASA Contractor Report 187555, 
September, 1991. 
3. Hypercard Stack "XPROCESSING," Martin Marietta Astronautics Group, Denver, 
CO 7 December 1989. 
4. Hypercard Stack "ZARCS," Martin Marietta Astronautics Group, Denver, CO 7 
December 1989. 
5. S. Johnson, "IRAD ALS Requirements Summary," Martin Marietta Astronautics 
Group, Denver, CO. 8 November 1989. 
6. "Future Launch Vehicle (ALS_PLUS) Requirements," Martin Marietta Astronautics 
Group, 29 October 1989. 
7. J. H. Lala and S.J. Adams, "Inter-Computer Communication Architecture for a Mixed 
Redundancy Distributed System", Journal of Guidance, Control, and Dynamics, Vol. 
12, No. 4, July-August 1989. 
8. L. Alger, and J. Lala, "Performance Evaluation of a Realtime Fault Tolerant 
Distributed System", 23rd Hawaii International Conference of System Sciences, 
Kailua-Kona, Hawaii, January, 1990. 
9. L. Burkhardt, L. Alger, R. Whittredge, and P. Stasiowski, "Advanced Information 
Processing System: Local System Services",NASA Contractor Report 181767, April, 
1989. 
10. T. Masotto and L. Alger, "Advanced Information Processing System: Input/Output 
System Services", NASA Contractor Report 181874, August, 1989. 
11. G. Nagle, L. Alger and A. Kemp, "Advanced Information Processing System: 
Input/Output Network Management Software", NASA Contractor Report 181678, 
May, 1988. 
12. L. Burkhardt, T. Masotto J. Terry Sims, R. Whittredge and L. Alger, "Advanced 
Information Processing System: Inter-Computer Communication Services", NASA 
Contractor Report 187556, September, 1991. 
13. S. J. Adams, M. Dzwonczyk, "Techniques for Transient Error Recovery and 
Avoidance in Redundant Processing Systems", NATO/AGARD 49th Symposium on 
Fault Tolerant Design Concepts for Highly Integrated Flight Critical Systems, 
Toulouse, France, October 10-13, 1989. 
14. S. J. Adams, "Hardware assisted Recovery from Transient Errors in Redundant 
Processing Systems," 18th International Symposium on Fault Tolerant Computing, 
Chicago, IL, June 1989.
7-1
PRECEDING PAGE BLANK NOT FILMED
15. Babcock, IV, P. S. and K. C. Hell, A Cost/Reliability Model of Electronic Functions, 
CSDL-R-2 109, The Charles Stark Draper Laboratory, Inc., Cambridge, 
Massachusetts, October 1988. 
16. Military Handbook, Reliability Prediction of Electronic Equipment, MIL-HDBK-
217E, Department of Defense, Washington, D. C., October 27, 1986. 
17. H. Kando, T. Iwazumi and H. Ukai, "Singular Perturbation Modelling of Large Scale 
Systems with Multi-Time-Scale Property, "mt. J. Control, vol 48, No. 6, 1988. 
18. K. Trivedi and R. Geist, "Decomposition in Reliability Analysis of Fault-Tolerant 
Systems," IEEE Trans. on Reliability, vol. R-32, No. 5, December 1983. 
19. J. McGough, K. Smotherman and K. Trivedi, "The Conservativeness of Reliability 
Estimates Based on Instantaneous Coverage," IEEE Trans. on Computers, vol. C-34, 
No. 7, July 1985. 
20. A. Bobbio and K. Trivedi, "An Aggregation Technique for the Transient Analysis of 
Stiff Markov Chains," IEEE Trans. on Computers, vol. C-35, No. 9, September 
1986.. 
21. L. Segal and M. Slemrod, "The Quasi-Steady State Assumption: A Case Study in 
Perturbation," SIAM Review, vol. 31, No. 3, September 1989. 
22. J.T. Sims, "Performance Benchmarks for the VLSI - Fault Tolerant Processor," C.S. 
Draper Laboratory Internal Memorandum, February 1988. 
23. Dale A. Mackall and S.D. Ishmael, "Qualifications of the Flight Critical AFTI/F-16 
Digital Flight Control System," The 21st Aerospace Sciences Meeting, AIAA-83-
0063, Reno, Nevada, January 1983. 
24. Dale A. Mackall, "AIFTI/F-16 Digital Right Control System Experience," First Annual 
NASA Aircraft Controls Workshop, NASA Langley Research Center, Hampton, 
Virginia, October 1983.
7-2
C 
C 
C) 
C 
z
-
0 I-S 4-S 
E 
0 
S 
CL  
S 4-
S 
Cl)
S S 4-
.0 
L. 4-4-4 
4-
C 
S 
C 
0 
a. 
E 0 C) 
4
S 
•
-.
000 
'.c W W W 
ID	 .
0 0 0 000 
U
- 
I-u 0)0)0) S 
S 4-0 
• 000 
C - 
S >.0 W W W 
C -..000 000  000 
0)0)0) 
04-
000 
- .. .0 000 
0.5 5 0) 0) C) 
000 
0. 
S 
— CV)	 LO 
a_ 999 
- www 5 - N- DO C L. CD co 0 S m- 
-
CD C'J 0 
U. 
-S 
C ca &t)CD 
•— 999 C 
E L. N-0O co CD 0 
CD CJ 0 
5= 
0.0 
U-
_C 
•S 
4-
'-	 0 0O) CC 
C o OCCl) U 
.CE os.! 
a-I-U-
1. S 
a. 0 1. 
I.. 0 
0 
V 
-CI
S 4-S 
E 0 
S 
a. 
>1 4-
.0 S 
S
c'J 
Ott) 
tuioOwOo 0o° t o .- 
v0	 0 0 O	 0 
-
2QIQIQ 
OD Cl) 
-- 
000 v.ov 000 
.o.o. 
,-I-I- 
000
O 0 
I Cj
U) 0 00+ 
0.009
	
LO
— 
000 00 000 
0 0 co M 0oCtc 
-
.2 .2 .9	 0 
---W
	
0 • .9 -- .2 0 0 • ---° o 
000.0.
0. 0.0.0.	 0 
0m 0 m c6 m 0 m m	 tn %j 
ccc cc ca  
>.. > 
0.00 
.c0 .cc 
0.0 0.00 
a>	 a>... 0g)__ a>2-. a>	 a>-. 
g.ou> CD JC 00u) EV_ i_0.CV CO) ! 0Q00 
.0
0 00 OQ0O 
U.	 C	 U. — 0 C 0 LL. •• _ — 0 0 .c
Xr	 0 U. x
	
— 0 a> - a>o.g>_ 0-
6 'Lem
a> -	 .
a>E 
cc/)u. U.D1flCl)LL 
-
I-0 
o 
0 Cl) .0 0 0
0-0 
0.0 0 
E -' 
C.) 
- .-
0 0	 0 
LO 00000 U) 00 U)• 0 0 C')	 o 
C' 00 N- U)
0 
0 
0
co 
0 
0c'J 
0 
0 
U).-
0 0 
C 
0( 
CL LL 
—CU 
— 
C 0 U,0.0 
'0 .— — 
> —t w 
I CL 
.9— C-C 
>s — 
CU 0 CU = 
as 
C 
0 
U 
0 4 
.0 
U 
C 
C 
-J 
C 
0 
C4
CD 0 
'- 0 0 
-0 
000 
CD 0 
0
U) 
CtU) 
C 
0 
—'u 
0  
CD ox 
Eoc 
0.0 
CO 
o — 
.2' U) c 
U) C 
CD 0 
co
0 E 
—
.2 
co 
cr-
-
=
0. 
— 
.0 0 
0 
— . 
CL 
CD 0) Cl 
E 
-
C 
0 
0 0 
cI,cI, 
=1 c 
>>Q
U, 
.9 0 
U 
. 2 
0 
.! 
z
0 
Z5 
>
I. 
— 
— 4 
—
'D 
C 
2 
• 
— U. E o. 
d ui
0 
0 
U-
E 
cn	 I-. U. r	 0 
0w--- 
— C U) U) 
Eo 
U, 
E	 cc Zo 
w	
• • 
0.E' 0
U,>
0.0 0i)O
.0 
I.-
4 
E 0 
0 
(I) 
0 U 
C 
0 
4
0 0 
.0 
I-
as 4 
0 
U 
.0 0 
> 
U) 
U) 
• 00000 
— U) 1.CWWWWW C c - 0 0 0 0 0 
• ) — 0 0 0 0 0 
S 
C 
— U) C 
• 00000 C 
U) I__C +++++ WWWWW 
E • 0 0 0 0 0 b. — 00000  
C
) C—
0 0 0 0 0 
— . 0 0 0 0 0 
co C5 C> C> 0 0 
00000 
C 
• 99900 — 
_CWWWLL 
5 C)C> 0 o 0 
U) — LflOLt)00 I_= 
i—U) .--.00 U. 
S 
• 00000 C 
U) — .CWWWWW ++ 
E S	 .
 
0>0 0 00 
L. — tt)0')00 
•= .--- GO dd 
us 
•C 0>0) 
C
Eo0 
CE 
Qo
o. 
C.) 
c'J 
CM CJ C\J V 0 
00 wco 0	 10 
(D 0) OQ.0 U)	 It') csJ
.E Go	 C6 69 2 e U. 
a> 
a> 0 a>o a, CDCD 000 
to- t ww w t 
a, 0 a,c' > a, OWO 
— 
i CO CO 
-c 
C 
a,
C 
U — C 
— U F 
E 
m 
Cl) — cc - 
S 
 
.2 0 - E — 
> 
<
D) 
.
5 1
a..! 
1- U.00c
. = CD coc 6
.;; 0 
o 0 V 
° —
— 0
• W A' 
 00 - 0 p () E
F 
ci) 
0 
0 C.)
•0I C DIN- N. 
o 
o aICJ U)
U) 
0 
(1) 
0 C 
0o 
LO	 C, N-	 CM 
.-
- N-	 - 
.- 
1
tO 
.- 
(0 
- 
.
0 C) (00 
. 
CC'J (3) 
co o c..J
C\J 
— 
-	 -
LO 
-
0 C) 0 
_W C'J	 —
(0
-
CO
- 
Ob *e 
-J
-. 
0
h- 
0
•
0 
C 
C
C 
C
C 
C 
C 
c
C 
c
C 
( 
.- C1 ('4 
C
0 
(1)
0 
U)
0 
U) 
(D C 
C	 .(D 
C C 
C 
c/) Cl) Cl) 
cc 
a.2 ....
Cj C') 
— 
O
0 
LI..
0 o U) 
C 
cc cu 
—cut 
gutU..tu..t 
C	 0 
LL 0 z
c'J (0 000 U)	
.00 0 — C)	 C•4 14) 
o t9t 
—J 
0) 
c'J 
cJ 
OD
U) 
0 
0 
U, C 
-C 
C) 
0 
C 
0 
cli 
C) 
0 
U) 
U) CO 
cc 0 
LL CD 
CD
LL LI) 
-I--( C00 
cc 
U 0 
LL z 
4 
C 
U 
C 
U I-
C
N 
C 
U 
C 
U I-4 
.0 
I 
0 
0 
C.)
0 
U 
.0 
C 
C 
0 
N
Cd
U C 
-J 
C 
I-
C 
0 
U 
C.) 
0 
C 
E 
C 
C 
0 
0) I-
C 
C 
E 
C 
1 
— 
0 
i. 
C,, 
c'Jo 
o 
0
o
le	 MIV, 
o.o 
tc'J 
0
w cw 
0 6 CS	 cn iE CD 
0 0... I-
- U-
0 
— 0 
(D
o
••C') cJ LO v 
•0 000 
tc'J (0 www U) - -CJ 
0 0CJ0 0
N C') 0)	 C,) co OD 
c. 0	 •-•- -
U) C 
-c 
C-) 
> 
C.) 
=
— C 
o 0 0 
C.) 0- 0 
- 
. — 
-
1
0 
C C" 
CD 0 
=
'- C) 
C — CL 2 
C 
—
•=- C5O 
E
Oo 
2 '- 0 .0
U 
i 
0 0 c tv
U 
C
< ca 
LL o CD 
>0.0 
oW0< 
C.) 0 
-J
cc JD
•-to
C 
0 
. 00000 
•
00000 
— 
C .0 w w w w w 
CO-00000 
0 > — 0 0 0 0 0 
C 
cr 
C 
— C 
• 
C 
C ,.CWWWWW 
00000 
+++++ 
Em-00000 
— 0 0 0 0 0 
o,)o)co 
cc
0 0 0 0 0 
— L. .a 00000  
0-C 0 0) 0) 0) 0 0 
00000 
C
00000 
•
00000 
— 
U)
— 
.CWWWWW 
C . — 0 0 0 0 0 
C 00000 
-= 
CM
00000 
U. 
OM-00000 
c'cuoo 
C 
O .CWWWWW 
 ++ 
E 1.-. N C') C') 00 
I- — • 0 0 
00 
U. 
C 
•
•-	 0 
0>0) 
C
0
 Eoo 
.CE 
C.)o
CD 0. CD 
C.) -
Cl) 
0 
C) 
U) 
0 
C)
VI	 C'J CICD C'J 
0 (7) c' 
o ....ICDC')w 
CD o 0i' 
cx 61+ 60 C')
V 
C —	 •O 
0)
cli 
o" ON- 0) C').. 
•.-	 N-
!?	 rT CD 
, f — 
Di 
CD 
ci 
OI 'it N-N- N-
I c ) - C') CD 
81 e3 ,
() 
0 
C) 
(I) 
0 
()
- 
V	 CM
C CD 0 . 
CD C') 
0	 CD C') CD 
C)	 CD l) N-

(M CD 
o v	 C') 
CD 
CD 
N- N o 00
'' . 
CD SC') C') It 
_j
V CCC' .—	
—
co 
0C\J OLfl
—
0C') 
-	 0) C').,. 
(D 04
0) N-
-
b 
_I 
0 
C.)
0 
.0 
.	 0 
I. = 
C
2 
— 4 0 
0
U) 
c 
CU 
C) 
0 J 
o 0 0.	 C.) 
C.'J 
0 
cc C') (I) V (D 0 CU — _l 
00 
Ou. 
CD 
to LI.. 
CD 
c.2 
(U 
— 0 0 0 U) 
-
c c c 
c 
co
C 
co
C (U 
c 
>	 C.)
c 
C) c C.) 
2	 — c.j cj 
0 
.	 0 b0 I.-0 
co 
co	 0 
C	 i) 0 U) 0 Cl) 
C I C 0 C (D 
C/) Cl) 
CC'J CsJ 
—
C') 
— 
cc 0 Cl)	 0 0 
V —	 — — (U 
LL 
0 
—05CCU 
= 
U..tLL 
=
0 
0 0 CC 
LL
0 0 
0
0 
Z
— 0
b
0 0 
U) Cl) 
0 
c
0 C 0 C 
C CU
c CU C CU 
o C) C) 
.— c,.J CJ 
o 0 0 
o Cl) o Cl) 0 (l) C 
0
C 
0
c 
0 
C/) CO Cl) 
CY C..J C') 
0 0 0 
'- . 0 
-D	 — (U
'— 0 
— (U
'— 0 
(U = = =
CL 
cc 
0 0 0 
2
1. 
U 
2 
4
0 
0 
•0 
-c 
U 
4 
CM
•oI(D cJ CM do 0) 
SI,c CD C) 
U, i. (5tS 
o	 ICD
	
to 
0Icj-o 
-. .-
Ift 69
(0 0 
0 
D CD 0 C.J 
C ,- CO 
CD It) C') 
IM 4m co Ln 
U)	 It) 
CD C\J 
.- .-
_J
(I) 0 
0 
C 
ca 
-c 
0 
0 
C 
0 
0 
c'J 
0 
61)
Cl)
00 
650 U-0 
LL 
•	 0
c-.1 
0 
a-U 0 a-
U 
I-
4 
Cd
0 
a-U 0 a-
U 
I-
4 
4
a-
E 
0 a-0 
C') 
a-0 
0 0 
E 0 
z 
U) 
0 
U 
C 
0 
4 
0 a-
0
-
-
-
- 
>
•O 
0 
DL() 
o 
D 
•0 C(D
,- 
N-	 0) 
CD 
CI 
It)
CD  0) 
O) 
— 
co 
IT 
,
CD 
O 
(!)C') 
C')
	
(Y) 
(Do 
c%J 
60 
C\i	 It) LO — 
Ln	 ..- 0) O 
CO
a o 
CD	 CD .-
sc') C.J CD CO 
0
LO	 10) 03,b v 
- 
0
0 0 
to 
0 C 0 C 0 C C 
65
C 
(5
C (0 
> 0 0 0 
.- c'J c\J 
o 
0.
L
0
I
0 0 
I- U) (I) Cl) 
• 
(5 
C
I.. 
0 
Cl)
- 
0 
U)
0 
Cl) 
0 
cc
C 
w
C 
CD
C 
0 
C/) co 
cis CM 
- •.-
CM 
—
C') 
— 
0
co
0 0
to tv cc 
LLLLU- t
APPENDIX B 
MODEL REDUCTION TECHNIQUE FOR FTP ANALYSIS 
B. 1 Introduction 
There is an increasing demand for fault-tolerant control systems in high 
performance, critical applications. At the core of such systems, there is one or more fault 
tolerant processors (FTP's). The FFP receives information from sensors, sends 
commands to actuators and performs redundancy management. Given its critical role in 
any control system, the operation and performance of the FTP must be very carefully 
modeled if the safety and performance evaluation process is to be relied, upon. 
An FTP may contain computing elements, dedicated and/or shared memories, 
information replicating components, I/O-dedicated electronics, etc. The key feature of an 
FTP is the ability to handle faults in a controlled, timely manner that enables correct 
reconfiguration and uninterrupted (on the time scale of interest) operation. An accurate 
model must be able to describe in detail rate processes such as component failures, fault 
detection, fault isolation and reconfiguration. 
The Markov modeling method has clearly emerged as the preferred approach with 
regard to the analysis of such processing systems. The ability of this approach to. correctly 
capture rate processes and event sequence dependencies are of crucial importance. While 
discrete simulation approaches might be perfectly adequate from the point of view of 
modeling flexibility, computational efficiency strongly tilts the balance towards the 
Markovian approach. This is so because, for such highly reliable systems, the component 
failure rates are very low. Consequently, the fault occurrence event rate is so low as to 
require a prohibitively large number of trials in order to accumulate statistically meaningful 
and reasonably accurate results. 
The Markov method however suffers from a major drawback. The number of 
states proliferate rapidly, often leading to an intractably large model. The FTP proper 
represents a system of moderate size, such that even a rather detailed model does not 
generally pose a major problem. However, when attempting to analyze the entire control 
system which the FTP is part of, maintaining the level of detail desirable for the FTP alone 
would most likely give rise to an unwieldy, perhaps even intractable model. Techniques 
such as aggregation, truncation and decomposition are used to mitigate this basic difficulty. 
Still, it would be highly desirable to devise a simplified model of the FTP which would 
greatly contribute to alleviating the space explosion problem while correctly preserving the 
main features of the detailed model. 
The purpose of this appendix is to describe the development of such an approximate 
model. First a detailed Markov model of a triply redundant FTP is introduced. The model 
reduction technique is then presented, leading to an excellent approximate model for this 
B-i
FTP. A complete analytical solution of the reduced model follows. To get a feel for the 
approximations involved, the procedure is shown applied to a simple example, for which 
analytical solutions are feasible for both the exact and reduced models. The appendix ends 
with a few concluding remarks regarding the reduction technique to be presented. 
B. 2  Detailed Triplex FTP Markov Model 
A detailed Markov model for a Quad FTP is described in Section 4 of [1]. The 
model tracks separately, within a computational channel, the processor element, the 
associated dedicated memory and the corresponding interstage. Both permanent and 
transient failures are accounted for, along with the appropriate reconfiguration mechanism. 
Provisions are made for including common mode failures as well. This model provides a 
realistic paradigm of the fault occurrence/fault handling processes. 
The cost analysis presented in Section 5 is focused on a comparison of two triplex 
architectures, one of which is based on an AlPS triplex FTP. The approach used in 
constructing the quad model was used to generate a similar triplex Markov model. In order 
to simplify the description of the model reduction method, some additional assumptions 
were made: 
- the processor and its associated memory in one channel were treated as one 
component, 
- the triplex coverage was assumed perfect and 
- common mode failure was disregarded. 
The Markov model for the triplex FTP, based on these assumptions, is shown in 
Figure B-i. The notation used for the state transition rates is the following: 
- ?.ab (,sl) is the failure rate for component a (where a = p for the processor and a = i 
for the interstage) from configuration n (where n = t, d or s, i.e., triplex, dual or 
single, respectively); b indicates the type of failure, (with b = t or p denoting 
transient or permanent failure); si stands for system loss. 
- Pb is the rate of the reconfiguration process initiated by a b-type failure of 
component a starting from configuration n; here a, b and n have the same meaning 
as above. 
State 1 represents operation with no failures. States 6 and 11 correspond to 
degraded modes of operation, namely operation with one channel failed and two channels 
failed, respectively. Finally state 12 denotes an aggregated system loss condition. This 
state is reached either as a result of incorrect reconfiguration or because of exhaustion. 
B-2
B-3 
0 
bO 0 
v0 
O0 
Cd	 ' 
>10 
0
0x 
o
0 
. 0 
0
CL 
0 
C 
C 
•0
The group comprising states 2,3,4 and 5 is associated with the state of the system 
after one failure and the group including states 7,8,9 and 10 corresponds to the system after 
two failures. The states in these two groups are characterized by extremely short holding 
times, when compared to states 1, 6, 11 or 12. Specifically, the "fast" states represent 
intermediate configurations, persisting for only very short periods of time during the 
reconfiguration processes. The reader should note that both permanent and transient 
failures are accounted for, with a distinct reconfiguration path. Specifically, transient 
failures, i.e., states 2, 5, 7 and 10, are reconfigured back to their respective origin states, 
i.e., states 1 and 6. In contrast, permanent failures, i.e., states 3, 4, 8 and 9, lead to 
reconfigurations to the appropriate degraded operational modes, states 6 and 11. 
This Markov model captures the key fault occurrence and handling processes in a 
triplex FTP, designed to withstand Byzantine faults. The actual model used in the cost 
analysis contains additional details, which makes it a realistic tool for studying the 
performance of an FTP processor. 
It is perfectly feasible to use this model to analyze the FTP on a stand alone basis. 
Still, care must be taken in the solution technique to overcome difficulties caused by the 
very pronounced stiffness of the resulting system of ordinary differential equations. 
Indeed, there is an enormous discrepancy between the time constants characterizing the 
failure events and those associated with reconfiguration processes. 
When the FTP must be analyzed as a subsystem within a much larger control 
system, the high level of detail in the model becomes a liability. This is so both because of 
the large state space the analyst will have to deal with and in view of the stiffness aspect 
mentioned above. It is thus natural to search for an approximation technique allowing a 
high level of fidelity, while providing a much more tractable model to work with. Such a 
technique will be described in the next section. 
B.3 FTP Model Reduction 
As already mentioned, the detailed Markov model has a number of states 
characterized by a very short time constant relative to the time scale of the failure events. 
This situation immediately suggests the possibility of a behavioral decomposition. The 
existence of distinct time scales is often encountered in control system applications and is 
exploited to generate a reduced model of the original system. Reference 17 presents a good 
review of this approach in the system control area. The same basic idea is also often used 
in simplifying various physical models (see, for example, [171). These reduction 
techniques often rely on detailed eigenvalue analysis and consequently are rather 
cumbersome. The reduction techniques become considerably more appealing when it is 
obvious which states are characterized by fast time constants and which ones are "slow". 
B-4
In the reliability analysis field, the need to model both the fault occurrence (slow) 
and the fault handling (fast) processes leads naturally to the situation previously described. 
There is a strong incentive to perform a systematic behavioral (or temporal) decomposition 
in order to both reduce the size of the state space and also remove the severe stiffness of the 
mathematical model. References [18] and [19] propose a decomposition approach. While 
the approach is well founded, it is quite impractical for complex applications because of 
some rather cumbersome probabilistic arguments used to determine aggregated transition 
rates. In [20] Bobbio suggests another approach, similar to that used in [21], which leads 
to a systematic and straightforward reduction model reduction procedure. The technique 
does not use a formal eigenvalue analysis, relying solely on an examination of the original, 
detailed model structure and transition rates. This technique will be applied to obtain a 
reduced FTP model. 
The state transition rates may clearly be divided into two separate sets, one 
consisting of the slow rates, i.e., the failure rates, the other consisting of the fast rates, i.e., 
the reconfiguration rates. We can then partition the n states of the model into two disjoint 
and exhaustive subsets defined as follows: 
- [S] is the set of nS
 
slow states (1, 6, 11 and 12), i.e., states with no outgoing 
transitions classified as fast, 
- [F] is the set of nFfasr states, i.e., states with at least one fast outgoing transition. 
For convenience, we further subdivide the set [F] into the subsets [Fl] and [F2], 
corresponding to the fast states reached following a single failure (2, 3, 4 and 5) and two 
failures (7, 8, 9 and 10), respectively. The transition matrix and the associated probability 
vector are then reordered such that the equations governing the states in [S] become the first 
ns equations, followed by the nF 1 equ:. oflS corresponding to the states in [Fl] and the n 
equations corresponding to the states in [F2], with nF = flFl + n. 
After this reordering, the equations describing the Markov model can be written as: 
r PS
	
1 D0 B01 B02 IP 
=B10 D1 0 I 
 
B20 0 D2 L PF2 j 
where:
[-A4 0
	
0 
=	 0	 6 0	 0	
D1 = diag( 62, 83 , 84 , -65) 
I	
D2 = diag(	 -8, -, -öio) 
	
0	 0 -A11 .o 
	
1 0	 0 +	 0 ]	
Omh1mat1
(B-i) 
B-5
S. e 
P21 P31 P41 P51 0	 0 0 0 
B01 =
P26 P36 P46 P56 B02 =
P76	 P86 P96 P10,6 
0 0 0 0 P7,11	 P8,11 P9,11 Pio,ii 
A'2 A'3 A'4 ?5 ?7	 A'8 A'9 
A' 12 0 0 0 0	 A'67	 0 0 
A' 1 3 0 0 0 0	 A'68	 0 0 
B10= B20= 
A' 14 0 ° 0 0	 49	 0 0 
A' 15 0 0 0 0	 A'jo	 0 0
For facility, the notation follows the convention indicated in Figure B-2. The total 
outgoing transition rate from the fast state "k" is denoted 8k and is given by: 
= Ph + Pkj + A'k 
while the total outgoing transition rate from slow state "i" is denoted Xi and is given by: 
A'i=A'ik 
where the summation is implied over all the fast states "fed" by slow state "i". 
fast state 
system loss 
Figure B-2 Notation Convention 
B-6 
At this point, the key approximation is made that the fast states reach their steady 
state well within the time scale of interest, i.e., mission time. In other words, the fast 
states are assumed to respond instantaneously to changes in the slow states. Setting the 
temporal derivatives of the fast states' probabilities to zero leads to the following 
approximate expressions for the probability subvectors PFI and PF: 
PFI = -D'B i0P and F2 = -D21 B 20P	 (B-2) 
These expressions for PF1 and PF2 are used to eliminate them in favor of PS in the 
first nS equations, leading to the following reduced system of equations for Ps: 
Ps = {D0 - B01D'B 10 - Bo2D'B2o] P = A P	 (B-3) 
It should be noted that the algebraic manipulations implied in (B-2) and (B-3) are 
particularly easy to carry out in our application because of the specific structure of the 
Markov model. Indeed, the submatrices D1 and D2 are strictly diagonal, making their 
inversion trivial. The significance of their strictly diagonal structure is that no direct 
coupling exists among the fast states. Moreover, B10 and B20 are very sparse, further 
reducing the computational effort required to obtain the approximate model equations. 
The transition matrix of the reduced model has the following structure: 
Fa 1 0 0	 ol 
As
a 1 a22 0	 0 
0 
[a i
a;2 a;3	 0 
a 2 a 3	 0]
where:
(B-4) 
5	 10 
al 1 = -	 (P k6 + ?.k), a 2 = -	 (pi.ii + 
k=2 8k	 k=7 
5	 10 
=	 1k Pk6 a32
* 
= a 1	 Pk,11 
	
k=2 8k 	 k=7 
5	 10 
'k a42= 
k=2 8k	 k=7 
	
A33	 X11, a3=X11.
(B-5a) 
(B-5b) 
(B-5c) 
(B-5d) 
B-7 
The reduced formulation is thus fully defined in terms of the original model 
parameters. A more insightful and convenient, but still fully equivalent formulation, that 
clearly identifies the successful as well as the failed reconfigurations, can be obtained by 
rewriting the reduced model transition matrix in the following form: 
	
a ll	 0	 0	 0 
[
	
	 I	 -6)
a2 
As —	
0	 0 
	
0	 cda2	 a3	 0	 I 
[ict a i (1cd)a 2 	 -a 3 	 0	 ] 
Here, the equivalent triplex and duplex coverages are obtained by simply comparing 
formulations (B-4) and (B-6): 
5	 10
Xlk	
-Pk6	 I a Pk, 11 
Ct	 k=2	 and c= 
10 
k=7 1k
	 (B-7) 
66 + k)	 Ya k6k 6,11 + Xk) 
k=2 8k	 k=7 
A few remarks are in order at this point. Since it was assumed that the triplex 
coverage (i.e., the detection and the isolation) is perfect, a system loss can be caused only 
by a coincident failure. Since the reconfiguration rate is many orders of magnitude greater 
than the failure rate, the equivalent triplex coverage is very nearly 1.0. In contrast, the 
situation for. the dual operation, reached after a successful recovery following a single 
failure, is quite different. Here, the detection is still assumed perfect, but for isolation we 
must rely on self-test, which is assigned a probability of success Cd,jsol < 1.0. 
Consequently, a transition to systemloss may take place not only because of a coincident 
failure but also predominantly because of improper reconfiguration. As a result, the 
equivalent duplex coverage may be significantly less than 1.0. 
A simple example will further clarify this important aspect. Disregarding transient 
failures and treating the processor/interstage set as one "component" with a failure rate X 
and a reconfiguration rate p, we have, from (B-7): 
3A. 
p + 2?  
= 1 ct =	 (B-8) 
3?. (p+2A) 1+2k 
p+2A	 p
B-8
2X Cd,jsolP 
CdjsolP + (lCd,jsol)P + ?L	 Ccj,jsol 
______________________________ - 	 (B-9) 
2?.	 [cd,j1p + (lCd,jj)P + ]
	
1 + A. 
Cd,jsolP + ( l-cci, jso i)p + ?	 1 
Since ? << p, then c 1.0 and	 Cd,jsol. The expressions (B-i) properly reduce 
to the simple model often used to represent a triplex FTP. 
The reduced model obtained in this section is illustrated in Figure B-3. The three 
operating states (1, 2 and 3) in the reduced model correspond on a one-to-one basis to the 
operating states (1, 6 and 11) in the original model. The transition rates used in this model 
are given by the expressions (B-5) and and (B-7). From the initial n-state model (n = 12), 
an approximate model containing only ns states (ns = 4) has been obtained. Extensive 
numerical experimentation, comparing the original model with the reduced one, has 
consistently indicated an excellent agreement, proving the validity of this temporal 
decomposition technique in our application. 
The simplicity of the reduced model allows a fully analytical solution, which will be 
introduced in the next section.
B-9
no failures	 one failure	 two failures 
a.	
?Ct .......... 
............. 
system loss 
Figure B-3 Reduced Markov Model of a Triplex FTP 
B. 4  Analytical Solution of the Reduced FTP Model 
The simple, non-cyclic structure of the reduced model allows a compact, analytical 
solution. The reduced model is basically a chain, with additional transitions to system loss 
before component exhaustion. It can be easily shown that the solution to the model 
depicted in Figure B-3 is:
Pi = et	 (B-10a) 
P2 = Atct [e-)J't - e?t]	 (B-10b)
td 
(jt)( dcci)	 [(,,d,$) et - (Xt .5 ) et +	 e4.St] (B-lOc)P3 = (td) (ds) (t,$) 
The probabilities correspond to the three operational modes postulated for the 
triplex FTP, i.e., operation with no failures, with one failure and with two failures, 
respectively. For the particular case when ?' = 3k, d = 2? and X s = ?, these formulas 
-'
B-lO
take on an especially simple, compact form, i.e., a binomial formula modified to account 
for imperfect coverage, 
P1 =R	 (B-ha) 
P2 =300(1Ro) (B-11b) 
P3 = 3 ctcd p ,(1 - Ro)2 	 (B-11c) 
where R0 = exp(-.t). In these formulas, the appearance of the coverage 
probabilities account for the obvious need for successful reconfiguration if the FTP is to 
continue to operate in a degraded mode. 
This analytical formulation is a powerful tool for carrying out extensive parametric 
studies. The formulation can be easily adapted to a different type of FTP, for example a 
quad configuration. 
B.5 Exact and Reduced Models for a Simple Example 
It is instructive to examine a simple example to reveal the exact nature of the 
approximations involved in carrying out the model order reduction procedure outlined up to 
this point. A simple example will further clarify this important aspect. Let us consider a 
dual FTP, treating a processor/interstage set as one component, subject to both transient 
and permanent failures occurring at the rates X t and Xp, respectively: The same 
reconfiguration rate, p, is assumed for both transient and permanent failures. The self-test 
coverage, accounting for the imperfect isolation characteristic of the dual architecture, is 
denoted by c. In view of the much greater rate of the reconfiguration process compared to 
the rate of failure events, the effect of a coincident failure will be disregarded. The Markov 
model incorporating all these assumptions is illustrated in Figure B-4. This model can be 
solved analytically to yield the following expression for the probability of system loss: 
	
pact = 1 - A eS i t - Be S2t - C eXt	 (B-12) 
where the coefficients and the exponents are given by: 
	
A = [s2 - 2p (1-c)] B = ? [s i - 2p (1-c)]	 =	 2cA.p 
(s i
 - S2)(5 1 - A.)	 (si - S2) (S2 - A.)	 - A.) (s2 - A.) 
SI and S2 are the roots of: s2 + (2A. + p) s + 2p [Ap + (1-c)A.j =0 
and A. is the total failure rate, i.e., A. = A. + A.. 
B-li
Applying the procedure previously outlined, the initial model is reduced to the 
approximate model shown in Figure B-5. The probability of system loss for this model is 
given by: 
pappTox	 e 
	
= 1 -	 2{p + ( 1c)A ]t.. C e-t	 (B-13) 
where
% ^ 
	
1-2c	 ____ 
12c(?'L	 12c(L) 
	
X	 A. 
It can be easily shown that if A. << p, then the roots of the quadratic are well 
approximated by: 
S2 - 2 [ + (1-c)A.] and Si P (B-14) 
Substituting (B-14) into the expression of the coefficients in (B-12) leads to the 
conclusion that: 
A —*O, B—B and C-4C, 
as ()L/p) —30 
In addition, it is clear that the first exponential will decay very rapidly compared to 
the other two. Consequently, for any length of time sufficiently in excess of the time scale 
of the reconfiguration process (i.e., lip), the approximate solution, (B-12), is in excellent 
agreement with the exact solution, (B-13).
B-12
no failures	 successful recovery	 system loss caused by 
(duplex operation)	 after one permanent failure 	 incorrect reconfiguration 
(simplex operation)	 or exhaustion 
Figure B-4 Markov Model of a Dual FTP 
no failures	 one failure 
IIII	
2c?
system loss 
Figure B-5 Reduced Markov Model of a Dual FTP 
B-13
It is interesting to note the expression of the equivalent dual coverage for this 
example. From Figure B-5 and according to equations (B-7), this effective coverage is 
given by: 
cd=	 2c?	 =	 C 
2(1-c)?.+2c?	 1+(l-c)- (B-15) 
XP 
The effective coverage is equal to the self-test coverage when the transient failures 
are disregarded. It decreases monotonically as the ratio of transient - to - permanent 
failures increases. This result is quite general, in spite of the simple model used to illustrate 
it. 
B.6 Conclusions 
A methodology enabling a systematic reduction of a complex FTP model has been 
developed. The technique relies solely on an examination of the structure of the transition 
matrix, with no eigenvalue analysis and coordinate transformation necessary. The reduced 
model captures all the relevant features of the detailed model and represents an excellent 
approximation. The simplicity of the reduced model allows a fully analytical solution, 
which is extremely effective especially when extensive trade studies are required. 
The technique is illustrated on a simple but instructive example, for which an 
analytical solution is possible for both the exact and the reduced models. The high quality 
of the approximation is clearly shown. In addition, the crucial impact of the transient 
failures on the success of the reconfiguration process is demonstrated. 
The methodology and the results presented herein provide considerable insight 
regarding the difficulties and the subtleties involved in the rigorous reliability and 
performance analysis of an FTP.
B-14
Report Documentation Page 
ScXe 
1. Report No. 2. Government Accession No. 3. Recipient's Catalog No. 
NASA CR-187554 
4. Title and Subtitle 5. Report Date 
Advanced Information Processing System for Advanced September 1991 
Launch System:	 Avionics Architecture Synthesis
6. Performing Organization Code 
7. Author(s) 8. Performing Organization Report No. 
Jaynarayan H. Lala, Richard E. Harper, Kenneth R. 
Jaskowiak, Gene Rosch, Linda S. Alger, Andrei L. Schor
__________________ 
10. Work Unit No. 
506-46-21-56 9. Performing Organization Name and Address 
The Charles Stark Draper Laboratory, Inc. 11. Contract or Grant No. 
Cambridge, MA 02139 NAS1-18565 
13. Type of Report and Period Covered 
Contractor Report 12. Sponsoring Agency Name and Address National Aeronautics and Space Administration 
Langley Research Center 14. Sponsoring Agency Code 
Hampton, VA	 23665-5225 
15. Supplementary Notes 
Final Report 
Technical Monitor:	 Felix L. Pitts 
16. Abstract 
The Advanced Information Processing System (AlPS) is a fault-tolerant distributed 
computer system architecture which has been developed to meet the real time 
computational needs of advanced aerospace vehicles. One such vehicle is the 
Advanced Launch System (ALS) being developed jointly by the National Aeronautics 
and Space Administration and the Department of Defense to launch heavy payloads 
into low earth orbit at one tenth the cost (per pound of payload) of the current launch 
vehicles. An avionics architecture that utilizes the AlPS hardware and software 
building blocks has been synthesized for ALS. This report describes the AlPS for ALS 
architecture synthesis process starting with the ALS mission requirements and ending 
with an analysis of the candidate ALS avionics architecture. 
17. Key Words (Suggested by Author(s)) 18. Distribution Statement 
Advanced Launch System 
Fault-Tolerant Avionics Architecture Unclassified-Unlimited 
Reliability 
Redundant	 igital Computers Subject Category 62 
Distributed Processors 
19. Security Classif. (of this report) 20. Security Classif. (of this page) 21. No. of pages 22. Price 
Unclassified Unclassified 158
NASA FORM 1626 OCT 86
