Airborne Advanced Reconfigurable Computer System (ARCS) by Bjurman, B. E. et al.
NASA CR-145024
AIRBORNE ADVANCED
RECONFIGURABLE COMPUTER SYSTEM
(ARCS)
B. E. Bjurman, G. M. Jenkins, C. J. Masreliez, K. L. McClellan, and J. E. Templeman
(NASA-CR-145024) AIRBORNE ADVANCED RECONFIGURABLE COMPUTER SYSTEM (ARCS) Final
Report, Mar. 1975 - Apr. 1976 (Boeing Commercial Airplane Co., Seattle) 547 p HC
$13.00 CSCL 09E
N76-30865 Unclas G3/62 U87U8
August 1976
Prepared under contract NAS1-13654 by
Boeing Commercial Airplane Company
P.O. Box 3707
Seattle, Washington 98124
for
Langley Research Center
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
Hampton, Virginia 23665
https://ntrs.nasa.gov/search.jsp?R=19760023777 2020-03-22T13:14:27+00:00Z
1. Report No.: NASA CR-145024
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: AIRBORNE ADVANCED RECONFIGURABLE COMPUTER SYSTEM (ARCS)
5. Report Date: August 1976
6. Performing Organization Code:
7. Author(s): B. E. Bjurman, G. M. Jenkins, C. J. Masreliez, K. L. McClellan, and J. E. Templeman
8. Performing Organization Report No.: D642476
9. Performing Organization Name and Address: Boeing Commercial Airplane Company, P.O. Box 3707, Seattle, Washington 98124
10. Work Unit No.:
11. Contract or Grant No.: NAS1-13654
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration, Langley Research Center, Hampton, Virginia 23665
13. Type of Report and Period Covered: Final Report - March 1975 through April 1976
14. Sponsoring Agency Code:
15. Supplementary Notes: ARCS technical monitor: Mr. A. O. Lupton FID/NASA LRC
16. Abstract
Fault-tolerant electronic subsystems are being applied to future commercial transports to
improve their range-payload performance and to enhance their operational capability/reliability.
This 1-year study defined a new digital computer subsystem fault-tolerant concept and assessed
the potential benefits and costs of such a subsystem when used as the central element of a new
transport's flight control system. The derived Advanced Reconfigurable Computer System (ARCS)
is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault
conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault
condition is transient. The study included criteria development covering factors at the aircraft's
operational level that would influence the design of a fault-tolerant system for commercial
airline use.
A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system
availability and survivability; and a stringent digital system software design methodology was
used to achieve design/implementation visibility.
17. Key Words (Suggested by Author(s)): Airborne computer, Avionics, Fault tolerant, Reconfigurable, Redundant, Reliability analysis, Software design
18. Distribution Statement:
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 535
22. Price*:
*For sale by the National Technical Information Service, Springfield, Virginia 22151
CONTENTS
1.0 SUMMARY 1
2.0 INTRODUCTION 4
2.1 Program Overview 4
2.2 Statement of Work Summary and TCV Program Relationship 6
2.3 Report Organization 9
2.4 Acknowledgment 9
3.0 SYMBOLS AND ABBREVIATIONS 10
4.0 DESIGN CRITERIA 14
4.1 Operational Criteria 14
4.1.1 Economic Considerations 16
4.1.2 Flight Safety Considerations 36
4.2 ARCS Design Requirements 43
4.2.1 System Functional Design 43
4.2.2 Software Design 50
4.2.3 Hardware Design 50
5.0 ARCS DESIGN CONCEPT 55
5.1 Functional Description 55
5.1.1 System Reconfiguration 55
5.1.2 Functional Organization 68
5.1.3 Sensor Signal Selection and Fault Detection (SSFD) 75
5.1.4 System Self-Test 79
5.2 Software Design 82
5.2.1 Top-Down Design (Step 1) 83
5.2.2 Software Design Tree (Step 2) 83
5.2.3 Module Identification (Step 3) 85
5.2.4 Transition Diagram (Step 4) 86
5.2.5 Software Code (Step 5) 89
5.3 Hardware Design 89
5.3.1 Hardware System Architecture 90
5.3.2 Hardware System Interfaces 94
5.3.3 ARCS Computer Unit 100
5.3.4 System Test Panel 116
6.0 ARCS DESIGN ANALYSIS
6.1 Fault-Tolerance Analysis 120
6.1.1 Fault Analysis 120
6.1.2 Reliability Analysis 131
6.2 Cost/Benefit Analysis 169
6.2.1 ARCS Availability Cost Effect 171
6.2.2 Acquisition Cost 174
6.2.3 System Test Cost Effect 174
6.2.4 Cost/Benefit Summary 186
7.0 ARCS IMPLEMENTATION
7.1 ARCS Design Specification 188
7.1.1 ARCS Real-Time Operations 188
7.1.2 Ground Test Operations 190
7.1.3 Asynchronous Operations 191
7.1.4 Redundancy Management 191
7.1.5 Hardware Design Requirements 193
7.2 Feasibility of WWCS Modification 194
7.2.1 WWCS-ARCS Comparison 195
7.2.2 WWCS Modifications 195
8.0 CONCLUDING SECTION 197
8.1 Summary of Results 197
8.1.1 ARCS Design Criteria 197
8.1.2 ARCS Design Concept 198
8.1.3 ARCS Design Analysis 199
8.2 Conclusions 201
APPENDIX A 203
APPENDIX B 217
APPENDIX C 311
APPENDIX D 314
APPENDIX E 359
APPENDIX F 396
APPENDIX G 437
APPENDIX H 479
APPENDIX I 519
REFERENCES 535
TABLES
No. Page
1 Power-On/Power-Fault Recovery Strategy 62
2 Functional Characteristics Summary of ARCS Processor 102
3 Recovery Strategy 125
4 Surveyed Reliability Programs and Associated Analysis Methods 133
5 Reliability Program Features 133
6 Near- and Intermediate-Term CWS Failure Probabilities 150
7 Probability of No Module Failure 152
8 Autoland Availability/Failure Probability Results for ARCS Near-Term Baseline 154
9 Autoland Availability/Failure Probability Results for Intermediate-Term Baseline 154
10 ARCS Baseline Far-Term Results 155
11 ARCS Voting Node Trade Study Results for the CWS Mode Application 158
12 Definition of Stage Numbers for the ARCS Near-Term Application 161
13 Two-Step Failure Transitions Leading to System Failure for the ARCS Near-Term CWS Mode Application 162
14 ARCS Sensitivities for the Near-Term Application Model 164
15 Contribution of Computer Unreliability to System Failure Probability 165
16 Weather-Caused Interruptions 173
17 Line Maintenance Activity With System Test Capability 177
18 Line Maintenance Activity Without System Test Capability 177
19 Shop Maintenance Operation 180
20 Sensor System Removals 183
21 Sensor Maintenance Parameters 183
22 Maintenance Costs Per Year 186
23 Processing and Interface Requirements for ARCS 189
B1 Software Index 229
B2 Channel Identification 245
B3 Left and Right Channel Identification 245
B4 Monitor Flag Meaning 261
B5 Output Monitors 263
B6 Monitor Trip Counters 264
B7 Permanent Failure Flags 268
E1 Simulated Hardware Indicators 360
E2 Synchronization Indicators 361
E3 Recovery Indicators for Computer A 363
E4 Recovery Indicators for Computer B 364
E5 Recovery Indicators for Computer C 365
E6 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer A 366
E7 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer B 367
E8 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer C 368
E9 Input and Output for Computer A 369
E10 Input and Output for Computer B 370
E11 Input and Output for Computer C 371
E12 Simulated Hardware Indicators 373
E13 Synchronization Indicators 374
E14 Recovery Indicators for Computer A 375
E15 Recovery Indicators for Computer B 376
E16 Recovery Indicators for Computer C 377
E17 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer A 378
E18 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer B 379
E19 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer C 380
E20 Input and Output for Computer A 381
E21 Input and Output for Computer B 382
E22 Input and Output for Computer C 383
E23 Simulated Hardware Indicators 385
E24 Synchronization Indicators 386
E25 Recovery Indicators for Computer A 387
E26 Recovery Indicators for Computer B 388
E27 Recovery Indicators for Computer C 389
E28 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer A 390
E29 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer B 391
E30 Do Not Use, Permanent Flags and Counters for A, B, and C Signals for Computer C 392
E31 Input and Output for Computer A 393
E32 Input and Output for Computer B 394
E33 Input and Output for Computer C 395
F1 CARSRA Input Data 421
F2 CARSRA Example Transition Rates/10^6 Hrs 426
F3 CARSRA Input Example 427
F4 CARSRA Output Example 429
H1 Sources Utilized for the MARKOV Model Parameter Assessment Task 480
H2 Permanent Fault Rates 481
H3 Normal Power Transients (Examples) 483
H4 Abnormal Power Transients (Examples) 483
H5 Flight Control Sensor Difference Standard Deviations 500
H6 ICPS SS/FD Thresholds and Time Delays 503
H7 Predicted ICPS Transient Sensor Failure Data in Turbulent Flight 506
H8 Sensor Biases and Scale Factors 507
H9 ARCS Predicted Transient Sensor Failure Data in Turbulence 511
H10 Assessed Sensor Rate Ratios 517
FIGURES (Continued)
No. Page
44 Watchdog Monitor Timing Diagram 112
45 Power Supply Block Diagram 114
46 System Test Panel Block Diagram 117
47 ARCS System Test Panel Layout 118
48 ARCS Synthesis/Analysis Scope 121
49 ARCS/WWCS Fault-Tolerance Analysis Overview 122
50 Local Computer's Assessment of a System Function 126
51 Power-On/Watchdog Monitor Trip Recovery as Processed by an Operating Computer (Triplex Operation Attained) 127
52 Power-On and Watchdog Monitor Trip Recovery as Processed by an Operating Computer (Duplex Operation Attained) 128
53 Power-On/Watchdog Monitor Trip Recovery as Processed by an Operating Computer (Simplex Operation Attained) 129
54 Power-On/Watchdog Monitor Trip as Processed by an Operating Computer (Triplex-to-Duplex Degradation) 130
55 Example of Markov Model of a Triplex Stage 135
56 Flight Control System Dependency Tree 136
57 Functional Readiness Variations During Scheduled Maintenance 138
58 Reliability Study Components 140
59 Near-Term ARCS Dependency Tree 141
60 Intermediate-Term ARCS Dependency Tree 142
61 Far-Term ARCS Dependency Tree 143
62 Near-Term/Intermediate-Term Markov Stage Model 145
63 Far-Term Markov Stage Model 147
64 Voting Trade Study Configurations 157
65 Latent Sensor Failure Illustration 162
66 Latent Sensor Failure Model 167
67 Avionics Cost Factors 170
68 Line Maintenance Operation 176
69 Shop Maintenance Operation 179
A1 Application Control Law Overview 204
A2 Roll Autoland Control Law 205
A3 Pitch Autoland Control Law 206
A4 Yaw Autoland Control Law 207
A5 Go-Around Control Law 208
A6 Pitch Command Augmentation Control Law (With 7 Hold) 208
A7 Roll Command Augmentation Control Law 209
A8 Yaw Command Augmentation Control Law 209
A9 Maneuver and Gust Load Alleviation Control Law 210
A10 Flutter Suppression Control Law 210
A11 Sensor and Mode Control Block - Near Term Application Model 211
A12 Servo and Display Block - Near Term Application Model 212
A13 Sensor and Mode Control Block - Intermediate Term Application Model 213
A14 Servo and Display Block - Intermediate Term Application Model 214
A15 Sensor and Mode Control Block - Far Term Application Model 215
A16 Servo and Display Block - Far Term Application Block 216
B1-A ARCS Functional Tree 218
B1-B Ground Test Functional Requirements 219
B1-C Background Tasks Requirements Tree 220
B1-D ARCS Redundancy Management Functional Tree 221
B2-A ARCS Software Structure 222
B2-B ARCS Software Structure 223
B2-C ARCS Software Structure 224
B2-D ARCS Software Structure Tree 225
B3-A ARCS Transition Diagram 226
B3-B Redundancy Management Transition Diagram 227
B4 Scheduling Table 228
B5 Tree Breakdown of System Status 241
B6 Update Function 252
B7 Continuous Signal Selection/Fault Detection Algorithm 258
B8 Triplex Monitoring 262
B9 Duplex Monitor 263
B10 Triplex Monitor Trip Assessment 265
B11 Duplex Monitor Trip Assessment 266
B12 General Failure Status Assessment 267
B13 Servo Management Functional Breakdown 271
B14 Control Law Tree 279
B15 Inputs and Outputs of Control Laws 280
B16 Ground Test Interrupt Processing 287
B17 Maintenance Test Functional Requirements 289
B18 Test Structure 293
B19 Servo Test Functional Breakdown 299
B20 Display Function 302
B21 Ground Test Display Operation 303
D1 MCP-701 Throughput Prediction 317
D2 Aerospace Processor Throughput Comparison 320
D3 Aerospace Processor Input/Output Overhead Burden 321
D4 Cross-Channel Word Format 329
D5 Cable Failure Examples 331
D6 Summary of Candidate Sync Routines 338
D7 Skew Definition 340
D8 The "Wait" Algorithm for Frame Synchronization 341
D9 The "Master" Algorithm for Frame Synchronization 343
D10 "Time Window" Point of View for Failure Detection 344
D11 Failure Detection Based Exclusively on Wait Times 346
D12 Additional Failures Detected With Test For Sync Indicator Clear at Beginning of Routine 348
D13 Identical Failures Detected in All Computers For Master Approach 349
D14 Candidate Triplex Actuator For ARCS 353
F1 Example of Stage Markov Model 398
F2 Flight Control System Dependency Tree 400
F3 7 , 401
F4 401
F5 CARSRA Structure 405
F6 MAIN Program Flow Diagram 407
F7 READIN Flow Diagram 408
F8 INITYZ Flow Diagram 409
F9 FORMT Flow Diagram 410
F10 COMPUTE Flow Diagram 412
F11 Subroutine AVAIL Flow Diagram 413
F12 Subroutine SETETY Flow 415
F13 Flow Diagram For Subroutine FAILPR 417
F14 INDFP Flow Diagram 418
F15 DEPFP Flow Diagram 419
F16 424
F17 Triplex Stage MARKOV Model 425
F18 Duplex Stage MARKOV Model 425
G1 GE's Coverage Study Plan-TASK OUTLINE 438
G2 Markov Model Concepts Necessary for Coverage Definition 440
G3 Simple Triplex Stage Model 442
G4 446
G5 448
G6 454
G7 455
G8 Lower Limits for (N/M) at 60% and 90% Confidence Levels 459
G9 ARCS Redundancy Management Block Diagram 461
G11 Laboratory Breadboard Block Diagram 462
G12 463
G13 Cross-Channel Coil Sum Current Monitor 464
G14 A/D-D/A Self-Test Loop 465
G15 Watchdog Monitor 466
G16 Breadboard Servo Electronics Schematic 467
G17 Breadboard Demonstration and Analysis Objectives 468
G18 Groundrules for FMEA and Fault Table 469
G19 Failure Modes Considered in Fault Table 470
G20 Outline of the Failure Modes and Effects Analysis (FMEA) Method 471
G21 Typical Form Used to Construct Fault Table and Perform FMEA 472
G22 Outline of Statistical Estimation Method 473
G23 Results of FMEA for Duplex-to-Simplex Transition Case 474
G24 Data From Random Sample Fault Insertion for Duplex-to-Simplex Transition Case 475
G25 Overall Results for Servo Stage Second Failure Coverage Evaluation 476
H1 Characteristic Waveform of Lightning Stroke 485
H2 Characteristic Electrical Transient Induced by Lightning 485
H3 Sensor Redundancy Management Strategy at Triple (or Quad) Redundancy 488
H4 490
H5 Latent Failure Situation 491
H6 Sensor Redundancy Management at Duplex Redundancy 495
H7 ICPS SSFD Algorithm 501
H8 Analytically Derived Altimeter Signal 515
I1 Cross Channel Link Configurations 522
I2 Triplicated Tee Configuration 524
I3 Cross Channel Data Link Hardware - Block Diagram 532
AIRBORNE ADVANCED RECONFIGURABLE COMPUTER SYSTEM (ARCS)
DEFINITION STUDY
B. E. Bjurman, G. M. Jenkins, C. J. Masreliez, K. L. McClellan, J. E. Templeman
Boeing Commercial Airplane Company
1.0 SUMMARY
The Airborne Advanced Reconfigurable Computer System, or ARCS, program (NASA con-
tract NAS1-13654) was a 1-year study to (1) define a new digital computing system function-
al concept employing contemporary ideas of fault tolerance, (2) substantiate system perform-
ance improvements through the use of advanced reliability assessment methods, and (3) apply
methods for assessing potential benefits and costs of fault-tolerant computer technology, as
applied to future commercial transport avionics. The study was sponsored by NASA Langley
Research Center under the Terminal Configured Vehicle (TCV) program. Boeing Commer-
cial Airplane Company was the prime contractor, with General Electric, United Airlines, and
Boeing Computer Services as subcontractors.
The ARCS represents a unique combination of airline, aircraft manufacturer, and avionic
systems manufacturer participation in formulating a new fault-tolerant airborne computer
system architecture. Comparison of the resulting conceptual design with contemporary
system technology indicates improved airline operator profitability. This is a consequence of
enhanced system availability, lowered cost of replicated elements, and improved system
maintainability.
Other significant ARCS program results were the development of a reliability analysis tool
for evaluating redundant fault-tolerant systems and the adaptation of advanced software
design principles to define a stringent methodology for designing software for fault-tolerant
systems. The important element of the software design methodology is the design visibility
given to all parties involved in the design process.
Three primary tasks were involved in formulating the ARCS concept: criteria development,
design synthesis, and design evaluation.
Operational considerations, such as airline economics and flight safety, associated with an
autoland function and a control wheel steering (CWS) type of stability/command augmenta-
tion function were used to define survivability and functional availability criteria for the
ARCS. A system architecture with autonomous, redundant channels, capable of operating
down to simplex without pilot intervention and expandable to quadruplex redundancy, was
identified as a basic design requirement. An integrated test function to ensure system integrity
and to enhance system maintenance was a further requirement.
In the developed design concept, system reconfiguration is managed by software processes
driven by software and hardware monitors. The software was designed using top-down,
structured programming principles to ensure maximum integrity and visibility. The single-
box-per-channel hardware design uses a directly operable input/output concept with cross-
channel communication performed via dedicated one-way optical data links. Sensors, mode
controls, and servos are dedicated on a channel basis.
A fault analysis using a functional simulation demonstrated the validity of the reconfiguration
processes down to the functional level from which circuit design and software implementation
would begin. Survivability and availability of the ARCS design, including significant alternate
design configurations, were assessed with the CARSRA reliability analysis program developed
for the ARCS.
The analysis showed a fivefold improvement in survivability of the CWS function implemented
in ARCS compared to a CWS function implemented with contemporary airborne computer
technology (i.e., fail-operational/fail-passive capability in a triplex configuration), as
represented by the baseline GE MCP-703 Whole Word Computer System (WWCS). A quadruplex
ARCS with contemporary state-of-the-art component failure rates was shown to meet the
survivability requirement for a flight-crucial fly-by-wire CWS control function, while a
quadruplex two-fail-operational/fail-passive system would not quite satisfy this requirement.
The ARCS and WWCS autoland functions, when initially fault free, were both shown to satisfy
the survivability requirement with margin. The ARCS will also satisfy the specified autoland
survivability criterion with one failure incurred before the initiation of the critical
autoland phase. This improves ARCS autoland function availability, which means, in
comparison with the WWCS, a fourfold reduction in diversions due to autoland unavailability
for Category III operations. The greater functional survivability and availability result
from reconfiguration to simplex. For some users, simplex may not be an acceptable
mode of operation.
The cost/benefit analysis, using an airline model defined in the study, showed that the
improved availability of the ARCS, the lower cost of system acquisition, and the potential
maintenance cost reductions result in an annual saving in excess of $4000 per ARCS-equipped
aircraft compared to WWCS-equipped aircraft. Additional savings were identified but not
quantified.
Low-visibility operation is not standard procedure for all airlines today. Category III autoland
in particular is limited to a few airlines and airports worldwide. Low-visibility capability is
not a dispatch requirement but an airline option. It was concluded that the full benefit of
fault-tolerant computer technology will be reaped when the functions provided by the avionic
systems become dispatch critical or otherwise mandated for use on every flight. Fully imple-
mented active controls in future aircraft will require such functions.
Not all aspects of duplex-to-simplex reconfiguration were covered by the ARCS study; work
remains in the areas of analytical redundancy (fault monitoring of simplex signals) and
techniques for predicting and evaluating "coverage," i.e., the likelihood of function survival
given a fault in the duplex state, with a high degree of confidence. Assessment of "coverage"
is a central problem in designing fault-tolerant systems. Credible reliability estimations hinge
on the level of confidence at which "coverage" values can be estimated for duplex to simplex
operations. Application of an ARCS-developed method, based on random fault insertion, to
a gate-level simulation/emulation of a candidate computer is seen as the next step in fault-
tolerant technology consolidation.
The ARCS data supports application of fault-tolerant computer technology to increase the
effectiveness of flight-critical avionics at lower cost to the operator. Fault-tolerant computing
will be a significant element in achieving fly-by-wire and active controls technology, as well
as general use of Cat III capability, at acceptable cost.
2.0 INTRODUCTION
The Airborne Advanced Reconfigurable Computer System, or ARCS, program (NASA
contract NAS1-13654) was a 1-year study to (1) define a new digital computing system
functional concept employing contemporary ideas of fault tolerance, (2) substantiate system
performance improvements through the use of advanced reliability assessment methods, and
(3) apply methods for assessing potential benefits and costs of fault-tolerant computer
technology, as applied to future commercial transport avionics. The study was sponsored by
NASA Langley Research Center under the Terminal Configured Vehicle (TCV) program.
Boeing Commercial Airplane Company was the prime contractor, with General Electric,
United Airlines, and Boeing Computer Services as subcontractors.
The ARCS program joined a team of airline, aircraft manufacturer, and avionic systems
manufacturer personnel to formulate a new fault-tolerant airborne computer system archi-
tecture. The resulting conceptual design was compared with contemporary system technology
to determine its impact on airline operator profitability. System availability, cost of replicated
elements, and system maintainability were parameters in determining the new system's
benefit/cost to an airline.
Other significant ARCS program elements were to identify or develop an appropriate reliability
analysis tool for evaluating redundant fault-tolerant systems and to adapt advanced software
design principles to define a stringent methodology for designing software for fault-tolerant
systems. An important part of formulating the software design methodology was to identify
those techniques that provide design visibility to all parties involved in the software design
and implementation processes.
2.1 PROGRAM OVERVIEW
Three primary tasks were involved in formulating the ARCS system concept: criteria develop-
ment, design synthesis, and design evaluation. Criteria development covered all factors on the
aircraft operational level that influence the design of a fault-tolerant computer system. Of
those, economic considerations address the projected functional scope of the system, the
expected functional availability, and the required maintainability. Safety considerations yield
criteria specifying the required survivability of critical functions, which for this study
included controls-configured vehicle (CCV) fly-by-wire (FBW) functions and the Cat III
autoland function.
The operational criteria were transformed into design requirements for a fault-tolerant
(redundant) digital computer system, specifying wherever possible design principles that
would obviate single-point system failures. The following fundamental design requirements
were formulated for the ARCS triple-channel configuration.
• Each computer unit shall independently assess its own and the system's operational status.
• No computer or combination of computers shall interrupt another computer's normal
operation.
• The system's redundant operation must start up, and recover from transient fault
conditions, without flight crew intervention.
• The system design must be able to achieve functional operation down to a simplex
string of operable elements and be architecturally expandable to at least quadruplex
redundancy.
The core of the ARCS functional concept, developed from the above requirements, is the set of
reconfiguration processes: failure isolation, transient fault recovery, and redundancy
degradation. The key to reconfiguration is fault detection. The ARCS has five fault monitor
functions that, upon activation, directly initiate action leading to reconfiguration: power
interrupt, sensor signal failure detection, computational output monitors, servo fault
detectors, and a watchdog monitor. The watchdog monitor is mechanized in hardware to detect
faults that interrupt the real-time processing of the computer. In addition to these primary,
or first-level, monitors, second-level hardware failure detectors and hardware/software
self-test functions provide fault localization when the system is operating at a duplex
redundancy level.
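The reconfiguration behavior described above can be pictured as a small state machine. The following Python fragment is a minimal sketch, not the ARCS software. The monitor names are taken from the text. The class name, method names, and the simplifying rule that any monitor trip removes exactly one channel from the operating set are illustrative assumptions.

```python
from enum import Enum


class Monitor(Enum):
    """The five first-level fault monitors named in the text."""
    POWER_INTERRUPT = "power interrupt"
    SENSOR_SIGNAL = "sensor signal failure detection"
    OUTPUT_MONITOR = "computational output monitor"
    SERVO_FAULT = "servo fault detector"
    WATCHDOG = "watchdog monitor"


LEVEL_NAMES = {4: "quadruplex", 3: "triplex", 2: "duplex", 1: "simplex", 0: "failed"}


class RedundancyManager:
    """Hypothetical bookkeeping of which channels are in the operating set."""

    def __init__(self, channels=("A", "B", "C")):
        self.active = set(channels)
        self.isolated = {}  # channel -> (monitor that tripped, transient flag)

    @property
    def level(self):
        return LEVEL_NAMES[len(self.active)]

    def trip(self, channel, monitor, transient=False):
        """Failure isolation: remove a channel after a monitor trip."""
        if channel in self.active:
            self.active.remove(channel)
            self.isolated[channel] = (monitor, transient)
        return self.level

    def recover(self, channel):
        """Redundancy recovery: readmit a channel only if its fault was transient."""
        entry = self.isolated.get(channel)
        if entry is not None and entry[1]:
            del self.isolated[channel]
            self.active.add(channel)
        return self.level
```

In this sketch a transient watchdog trip on channel B degrades the system to duplex and a later `recover("B")` restores triplex, while a channel isolated for a permanent fault stays out of the operating set.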
In the flight control system configured around the ARCS, redundant channel processes are
consolidated at two system voting nodes: at the sensor signal input to the control law com-
putations and at the servoactuator output. The sensor signal selection process is mechanized
in software, and the servo output voting node is a hydromechanical mechanization. Both
are assumed to be designs with absolute integrity for first-failure detection.
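As an illustration of a software voting node at the sensor input, the sketch below selects one value from the redundant samples of a signal and flags samples that disagree with it. The selection rule is not stated at this point in the text. Mid-value selection for three samples and averaging for two are common choices and are assumed here, as are the function name and the single fixed threshold.

```python
def select_signal(values, threshold):
    """Return (selected_value, suspect_indices) for 1 to 3 redundant samples.

    Assumed rules (not taken from the report):
      3 samples: mid-value select; flag samples farther than threshold from it.
      2 samples: average; a disagreement flags both, since the values alone
                 cannot show which sample is wrong.
      1 sample : pass through; no comparison is possible.
    """
    n = len(values)
    if n == 3:
        selected = sorted(values)[1]
        suspects = [i for i, v in enumerate(values) if abs(v - selected) > threshold]
    elif n == 2:
        selected = 0.5 * (values[0] + values[1])
        suspects = [0, 1] if abs(values[0] - values[1]) > threshold else []
    elif n == 1:
        selected = values[0]
        suspects = []
    else:
        raise ValueError("expected 1 to 3 redundant samples")
    return selected, suspects
```

The two-sample case shows why additional means of fault localization are needed at the duplex level: a miscompare can be detected, but the comparison by itself does not identify the failed channel.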
The majority of the ARCS reconfiguration mechanisms are software processes, where it has
been recognized that the crucial aspect of a fault-tolerant system's software is to ensure
design integrity through design visibility. In the ARCS design, this was achieved through
the strict application of top-down design principles and structured programming techniques,
as well as the establishment of rigid documentation standards compatible with these concepts.
Since the ARCS concept emphasizes software processes to achieve system fault-tolerant
qualities, the hardware must be viewed as the vehicle to facilitate an effective software design.
The ARCS primary hardware element is a single computer unit, replicated on a per-channel
basis to build up a triplex- or quadruplex-redundant fault-tolerant system. The computer
unit contains a processor and all channel interface electronics. Sensor, mode control, and
servo hardware interfaces are dedicated on a channel basis. All cross-channel communication
is accomplished via dedicated one-way, serial, optical, digital data buses that independently
interconnect each computer to each other computer, providing complete electrical isolation
between channels. Each computer exclusively controls the engagement and shutdown of its
own servos.
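The cross-channel topology described above implies one dedicated one-way link for every ordered pair of computers. The short sketch below enumerates those links. The function name and the channel labels are illustrative assumptions.

```python
from itertools import permutations


def cross_channel_links(channels):
    """List one dedicated one-way link (sender, receiver) per ordered pair."""
    return list(permutations(channels, 2))


# A triplex system needs 3 * 2 = 6 one-way links, a quadruplex system 4 * 3 = 12.
triplex_links = cross_channel_links("ABC")
quadruplex_links = cross_channel_links("ABCD")
```

Because every link has exactly one sender and one receiver, the loss of any single link affects only one direction of communication between one pair of channels.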
The ARCS concept was analyzed with respect to its fault-tolerant performance with two
major objectives in mind: (1) to verify through fault analysis that the system concept in
fact had the required fault-tolerant qualities from a functional point of view and (2) to
assess, through a reliability analysis, the merit of those fault-tolerant qualities from a
probabilistic point of view.
Simulation was used as a tool in the fault analysis because the reconfiguration processes,
though each conceptually understandable, were difficult to validate using only a paper
analysis. The simulation demonstrated that the ARCS fault-tolerant design concept was
valid, even though the sensor signal selection/failure detection algorithms for the duplex-to-
simplex degradation were not fully developed.
The reliability analysis work involved reviewing available reliability analysis methods and
tools to identify those suitable for use in the ARCS study and assessing the reliability of the
ARCS design and selected configuration alternatives using the related method and tool. Ten
different reliability estimation programs were surveyed, six from Boeing and four from NASA.
None could adequately handle the ARCS reliability assessment task; therefore, a new tool
called CARSRA (Computer Aided Redundant System Reliability Analysis) was developed.
CARSRA was used to generate the projected reliability of the ARCS within the scope of the
defined commercial transport operational applications. Assuming present-day hardware
failure rates, a quadruplex ARCS-type system is required to meet the fly-by-wire reliability
requirement, while a triplex ARCS will meet the Cat III autoland functional reliability with
one module failed (sensor, computer, or servo) at the Cat III alert height. A comparison
with contemporary redundant system design approaches showed that a fail-op/fail-op
quadruplex system design would fall short of the fly-by-wire requirement, and a fail-op
triplex system would have to have all modules working at the alert height to meet the Cat III
requirement. The ARCS, with its ability to meet the reliability requirement for the Cat III
case with one module failed, reduces by a factor of 4 the diversions due to autoland unavaila-
bility, compared to the contemporary triplex system.
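The effect of reconfiguring down to simplex can be illustrated with a very small Markov-type stage model. The sketch below is not CARSRA and uses no data from the report. It assumes three identical modules with a constant failure rate, permanent faults only, a first failure that is always handled correctly, and a second failure that is handled correctly with probability `coverage`. A coverage of zero corresponds to a system that cannot continue past the duplex level.

```python
import math


def triplex_failure_probability(rate_per_hour, hours, coverage):
    """Probability that a triplex stage has lost its function by time `hours`.

    Assumptions (illustrative, not from the report): identical modules,
    exponential failure times, perfect handling of the first failure, and
    successful duplex-to-simplex reconfiguration with probability `coverage`.
    """
    q = -math.expm1(-rate_per_hour * hours)  # one module failed by `hours`
    r = 1.0 - q                              # one module still good
    all_three_failed = q ** 3
    # Exactly two modules failed: the function survives only if the second
    # failure was covered, so the uncovered fraction counts as a loss.
    uncovered_second_failure = (1.0 - coverage) * 3.0 * r * q ** 2
    return all_three_failed + uncovered_second_failure


# Example with made-up numbers: 1e-4 failures per hour over a 10-hour flight.
no_simplex = triplex_failure_probability(1e-4, 10.0, coverage=0.0)
with_simplex = triplex_failure_probability(1e-4, 10.0, coverage=0.95)
```

With these hypothetical inputs the stage that can reconfigure to simplex has a much lower failure probability than the one that cannot, and the result is sensitive to the coverage value assumed for the second failure.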
A new approach for measuring a system's "failure coverage," based on a failure analysis of a
randomly selected set of failure modes, was developed as an outgrowth of the ARCS analysis
studies. The approach, shown to be feasible by an analysis/laboratory experiment, is looked
upon as a potential cost-effective method for demonstrating a fault-tolerant system's
functional success probability.
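One way to turn a random sample of inserted faults into a statement about coverage is a one-sided lower confidence bound on a binomial proportion. The sketch below uses the exact (Clopper-Pearson) bound, found by bisection. This is a standard statistical technique substituted here for illustration. It is not presented as the estimation method developed in the study, and the function name and sample sizes are assumptions.

```python
import math


def coverage_lower_bound(covered, inserted, confidence):
    """One-sided exact lower confidence bound on coverage.

    covered    : number of inserted faults the system handled correctly
    inserted   : number of randomly selected faults inserted
    confidence : confidence level, e.g. 0.60 or 0.90
    """
    if inserted <= 0 or not 0 <= covered <= inserted:
        raise ValueError("need 0 <= covered <= inserted and inserted > 0")
    if covered == 0:
        return 0.0
    alpha = 1.0 - confidence

    def tail(p):
        # Probability of observing `covered` or more successes if coverage were p.
        return sum(
            math.comb(inserted, i) * p ** i * (1.0 - p) ** (inserted - i)
            for i in range(covered, inserted + 1)
        )

    lo, hi = 0.0, 1.0
    for _ in range(100):  # tail(p) increases with p, so bisection converges
        mid = 0.5 * (lo + hi)
        if tail(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo
```

When every inserted fault is covered, the bound reduces to `(1 - confidence) ** (1 / inserted)`, which shows how many samples are needed before a high coverage value can be claimed at a given confidence level.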
An assessment of the cost effectiveness of applying ARCS technology was carried out in two
parts: an analysis of airline cost-of-ownership for an ARCS maintained in a Cat III operational
status and an analysis of the cost effect of providing an integrated system test function. Three
primary factors were considered in these analyses: acquisition cost, maintenance cost, and
costs associated with schedule interruptions caused by not having an operational Cat II/
Cat III system. United Airlines supported these studies by providing a 150-airplane airline
model with route, schedule, and cost data associated with operating and maintaining flight
control electronic equipment. Study results show a potential annual saving to the airline
operator of $4000 per aircraft when compared to a system representing today's technology.
2.2 STATEMENT OF WORK SUMMARY AND TCV PROGRAM RELATIONSHIP
The criteria development, design synthesis, and design analysis summarized in the previous
section were conducted in response to six tasks defined in the ARCS contract statement of
work. The objectives of these tasks are summarized below.
The objectives of Task I were to establish design criteria and fault-tolerance management
requirements for an advanced reconfigurable computer system. The criteria and requirements
were to be applicable to projected future commercial transport avionic systems.
The objective of Task II was to develop a candidate ARCS that overcomes the fault-tolerance
performance limitations of the General Electric MCP-703 triply redundant Whole Word
Computer System (WWCS) used as a program baseline. The WWCS was developed under
Task 4 of the DOT/SST Technology Follow-On Program (contract DOT-FA72WA-2893).
In particular, the ARCS development was to result in functional capabilities that achieve
the following characteristics of a reconfigurable triple-channel computer system:
• Triple-, dual-, and single-channel redundancy operation
• Automatic computer restart following transient faults (redundancy recovery)
• Automatic signal selection and failure detection with recovery capability
• Automatic system monitoring and automated system test
The objectives of Task III were to assess the WWCS and ARCS configuration operational
reliability as a function of fault probabilities, to establish the ARCS operational reliability
gains and trends in comparison with the WWCS, and to determine ARCS fault-tolerance
performance.
Task IV was to develop system test design criteria, design a system test function, and establish
its cost-of-ownership effects when incorporated into the ARCS.
Task V objectives were to determine the fault-tolerant/reliability benefits of ARCS relative
to the WWCS and to determine the associated cost benefits of ARCS technology to the
airline operator.
Task VI was to assess the feasibility of modifying the WWCS and other suitable commercially
available airborne digital computers to conform to the ARCS configuration.
Interactions between the ARCS program tasks are depicted in figure 1. Input to the ARCS
development task is the set of projected automatic flight control system design criteria
compiled under Task I. The configuration definition resulting from the development task
supplies inputs for the cost/benefit assessment, Task V, of the contract schedule. Develop-
ment of the integrated ARCS test function, Task IV, was conducted in parallel and strongly
interacted with Task II. Task III, performance and reliability evaluation, interacted in an
iterative way with the design task.
The scope of design application was limited to flight control functions to obtain a
well-defined set of requirements, from one technology, where the need for fault-tolerant
processing has been clearly established. Since the most severe system design requirements of a jet
transport exist in the flight control area, the general validity of the ARCS program results
also applies to other avionic systems applications. On a broad basis, the results of the ARCS
program contribute to the Terminal Configured Vehicle (TCV) program and its goals of
advancing commercial transport technology.
Figure 1.-ARCS Tasks Relationships
[Diagram not reproduced. Legible labels: "Develop fault tolerant computer system concept (Task II)" and "ARCS report."]
Although the ARCS study is a part of the NASA TCV program, the NASA 515 airplane was
not considered a suitable application model to govern the development of ARCS design
requirements. For this purpose, functional and operational requirements of future, advanced-
technology transports were postulated. The TCV program test vehicle, however, could easily
serve as a flight test bed for a hardware/software implementation of the ARCS concept
resulting from this study.
2.3 REPORT ORGANIZATION
The balance of this report is organized into four major sections, a concluding section, and
several appendixes. Section 4 discusses the operational criteria that influence the design of
an airborne fault-tolerant computer system, as well as the resulting system design require-
ments. Section 5 describes the developed ARCS concept from a functional, software, and
hardware design point of view. Section 6 discusses the methods and results of the analyses
performed with respect to fault-tolerant performance and cost/benefits of the developed
concept. Section 7 presents the essence of a fault-tolerant computer system specification
and discusses the results of evaluating the adaptability of existing computer systems to the
ARCS specification. The concluding section, section 8, summarizes the study results and
provides concluding comments on the overall ARCS program.
To maintain visibility of the major themes presented in the main report, extensively detailed
material is presented as appendixes. Application control laws, software design details, hard-
ware design analysis, simulation results, and reliability analysis details are examples of
material contained in an appendix.
2.4 ACKNOWLEDGMENT
The Boeing Commercial Airplane Company wishes to acknowledge the General Electric
Company, Aerospace Controls and Electrical Systems Department, and United Airlines,
Maintenance Operations, for their support during the ARCS program. Messrs. L. E. Fairbanks
and R. P. Kurlak of GE deserve recognition for their contributions to the system design/
analysis tasks, and Mr. H. Takeuchi of United Airlines is recognized for his contribution of
airline operational parameters to the system test design and cost/benefit assessment tasks.
3.0 SYMBOLS AND ABBREVIATIONS
AC Advisory Circular
A/D analog to digital
AEEC Airline Electronic Engineering Committee
AFCS automatic flight control system
ARCS Airborne Advanced Reconfigurable Computer System
ATA Air Transport Association
ATC air traffic control
ATE automatic test equipment
ATT Advanced Technology Transport
BCAR British Civil Airworthiness Requirements
BIT built-in test
BITE built-in test equipment
CARSRA Computer Aided Redundant System Reliability Analysis
Cat II Category II
Cat III Category III
CCV controls-configured vehicle
CDP control and display panel
CIU computer interface unit
CLR clear (instruction)
CPU central processor unit
CRT cathode-ray tube
CTOL conventional takeoff and land
CWS control wheel steering
D/A digital to analog
DADS digital air data system
D/D digital to digital
DELS direct electrical linkage system
DG directional gyro
DMA direct memory access
DME distance measuring equipment
DOIO directly operable input/output
DOT Department of Transportation
DRO destructive readout
EIA Electronic Industries Association
EMI electromagnetic interference
FAA Federal Aviation Administration
fail-op fail-operational
FAR Federal Aviation Regulations
FBW fly-by-wire
FF flip-flop
FIFO first in, first out
FMEA failure mode effect analysis
GE General Electric Company
GSE ground support equipment
HLH heavy lift helicopter
ILS instrument landing system
I/O input/output
INS inertial navigation system
ISADS integrated strapdown air data system
kops thousand (kilo) operations per second
K thousand words of storage
LED light-emitting diode
LIFO last in, first out
LRU line replaceable unit
LSB least significant bit
LSC local sync command
LSI large-scale integration
LVDT linear variable differential transformer
MCP modular control processor
MEL minimum equipment list
MLS microwave landing system
MPX multiplex
MSB most significant bit
MTBF mean time between failure
PFCS primary flight control system
PROM programmable read-only memory
R/A radio altimeter
RAM random access memory
RCSS Reconfigurable Computer System Simulation
RESRA Redundant System Reliability Analysis
R-NAV area navigation
ROM read-only memory
SAS stability augmentation system
S/H sample and hold
SLI synchronous logic interface
SMAR serial memory address register
SOV shutoff valve
SPBP split-phase bipolar
SSFD signal selection/failure detection
SST supersonic transport
STOL short take-off and landing
STP system test panel
STRU servo transmit/receive unit
TCV Terminal Configured Vehicle
TMR triple-modular redundancy
TTL transistor-transistor logic
UA United Airlines
UART universal asynchronous receiver/transmitter
VG vertical gyro
VOR very high frequency omnirange
WWCS Whole Word Computer System
XMIT transmit (instruction)
2-D, 3-D, 4-D two-, three-, four-dimensional
4.0 DESIGN CRITERIA
A design requirement or design goal is normally expressed as, or associated with, a criterion,
i.e., a standard, rule, or test for measuring the excellence, fitness, or correctness of the particu-
lar characteristic specified. Airplane profitability and flight safety are the two criteria that
ultimately dictate the design of an airborne processor system used for flight control in a
commercial transport.
Design requirements and design goals for a subsystem must necessarily reflect requirements
and goals formulated on the higher system level. Thus, identifying design requirements for
the computer system starts with defining ARCS-related requirements and goals formulated
on the airplane operation level. The criteria defined on the operational level are subsequently
interpreted and formulated into criteria applicable to the design of the computer system, as
illustrated by figure 2.
Past experience of analog and digital flight-safety-critical computer systems is integrated
into the interpretation process. Other influences stem from the adopted goals and objectives
of the ARCS program as expressed by the contract statement of work. The most far-reaching
of these goals is to achieve high fault tolerance by providing the capability for a
triplex system to degrade in redundancy into an operational simplex system configuration.
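The degradation goal can be pictured as a small bookkeeping structure that tracks which channels remain in operation. The sketch below is a minimal illustration only: the class name, channel labels, and method names are hypothetical, and the real ARCS redundancy management logic is described in the design sections of the report, not here.

```python
class RedundancyManager:
    """Tracks the set of computer channels currently in operation."""

    # Mode name indexed by the number of active channels.
    MODE_NAMES = ("inoperative", "simplex", "duplex", "triplex", "quadruplex")

    def __init__(self, channels=("A", "B", "C")):
        if not 1 <= len(channels) <= 4:
            raise ValueError("this sketch supports one to four channels")
        self.channels = tuple(channels)
        self.active = set(self.channels)

    def report_fault(self, channel):
        # A faulted channel is removed; the remaining channels carry on.
        self.active.discard(channel)

    def restart_succeeded(self, channel):
        # A channel that restarts cleanly after a transient fault rejoins,
        # which restores the lost redundancy level.
        if channel in self.channels:
            self.active.add(channel)

    @property
    def mode(self):
        return self.MODE_NAMES[len(self.active)]


manager = RedundancyManager()
manager.report_fault("A")      # triplex -> duplex
manager.report_fault("B")      # duplex  -> simplex, still operational
manager.restart_succeeded("A") # transient cleared: back to duplex
```

The point of the sketch is the last two steps: a second like failure leaves a working simplex system, and recovery from a transient raises the redundancy level again.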
The computer system is a subsystem of the total airborne vehicle; it interfaces or interacts
with other subsystems in the airplane. Design optimization of the reconfigurable computer
system cannot be accomplished out of context, i.e., without considering the total vehicle
requirements and the functional and operational environment of the computer system; since
the interfacing or interacting subsystems contribute to the final outcome of system reliability
and fault-tolerance performance trades, they must be adequately represented in the analyses.
A significant portion of establishing the ARCS design criteria therefore involves identifying
the functional context and the interface environment into which an operational ARCS will
be integrated.
4.1 OPERATIONAL CRITERIA
A continuing evolution in flight procedures and airframe, sensor, and electronic circuitry
technologies will take place during the timespan in which an ARCS would be applied
to an airplane. Advancements in airframe and sensor technologies will influence the functional
complement of an ARCS: the further into the future the requirements are projected, the
more advanced functions the fault-tolerant computer system will be required to perform.
From a commercial transport design point of view, considering functional reliability and
mission flight safety, a controls-configured vehicle (CCV) type of fly-by-wire (FBW) is the
most advanced type of function to be anticipated for the foreseeable future. The autoland
function, although implemented in aircraft today, represents another significant design point
with respect to the design of a highly reliable, fault-tolerant computer system.
Figure 2.-ARCS System Requirements
[Diagram not reproduced. Airplane operational requirements, ARCS study objectives and assumptions, and past experience of fault-tolerant design feed an interpretation process that yields the ARCS design requirements.]
The advancement of electronic circuitry technology, including the emergence of airborne
qualified microprocessors, will impact the physical aspects (piece parts) of the ARCS hard-
ware much more than it will the functional architectural aspects. The system functional
architecture will be dictated largely by flight control design considerations.
For the purpose of defining the ARCS design criteria, it was postulated that the CCV/FBW
and autoland functions would be by far the most critical functions to be implemented in
the ARCS. Therefore, specifying reliability and fault-tolerant performance levels for these
two functions would automatically ensure adequate reliability and performance levels for
other ARCS functions.
To adequately treat the advancements in sensor and airframe technologies, it was further
postulated that three application models be used for the ARCS development: a near-term
model, using existing sensors, to provide the basis for comparison with the baseline WWCS
technology; an intermediate-term model based on a step advancement in sensors, postulating
a digital, integrated air-data strapdown sensor system; and a far-term model that anticipates
the full implementation of CCV/FBW airframe technology.
The airplane operational requirements pertinent to the design of an ARCS are discussed in
the following two sections. They break down into two groups, as illustrated by figure 3:
criteria derived as a result of economic considerations, i.e., those addressing operational
profitability (sec. 4.1.1), and criteria established to ensure adequate flight safety (sec. 4.1.2).
The ARCS design requirements resulting from interpretation of the operational requirements
are presented in section 6. These design requirements are divided into three groups address-
ing the functional design, software design, and hardware design.
4.1.1 ECONOMIC CONSIDERATIONS
Except for equipment required by regulation for flight safety, no new avionic systems are
installed in airplanes without expectations of improved overall airplane profitability. There
are three areas in which an ARCS has the potential to improve aircraft cost efficiency:
• The functional scope of the system, which influences the initial system cost as well as
the potential operational benefits of the system
• The functional readiness of the system, or functional availability, which determines the
degree of achieving the potential operational benefits
• The system maintainability, which influences the burden of maintaining the system at
the functional status required to achieve the operational benefits
How do these three characteristics translate into design requirements for the computer system,
and what are the relationships between operational and design criteria?
The functional scope of the flight control system determines the computational speed and
memory capacity requirements, as well as the sensor, servo, and mode control/display
interface requirements, that the fault-tolerant computer system must satisfy.
Figure 3.-Airplane Operational Requirements Breakdown
Airplane operation oriented requirements
  Section 4.1.1 Economic considerations (profitability)
    Section 4.1.1.1 Functional scope
    Section 4.1.1.2 System functional readiness
    Section 4.1.1.3 System maintainability
  Section 4.1.2 Safety considerations (accident risk)
    Section 4.1.2.1 Fly-by-wire
    Section 4.1.2.2 Autoland
The functional readiness or availability translates directly into the computer system's
capability to survive failures and to handle multiple component failures in the interfacing
sensors and servos; it will drive the designer to seek a solution with multifailure tolerance.
The operational system maintainability requirements will result in a set of design requirements
aimed at the implementation of computer built-in test (BIT) functions for the computer
itself and for its interfacing peripherals, including the man/machine interface for the test
functions.
The following sections discuss the above three areas and identify operational requirements
for the definition of ARCS design requirements.
4.1.1.1 System Functional Scope
From an airline operational point of view, automatic flight control functions fall into one of
the following three categories:
• Dispatch-Critical Functions—functions without which an airplane cannot legally be
dispatched on a revenue flight
• Flight-Mandated Functions—functions without which certain operations cannot be
conducted and without which an airline may elect not to dispatch an airplane under
certain conditions
• Pilot Workload Relief Functions—functions that do not impact an airplane's dispatch
status nor significantly affect the flight plan, yet have a definite convenience value to
the pilot
From the flight control system designer's point of view, the same functions fall
into three different categories (see fig. 4):
• Flight-Crucial Functions—functions without which an immediate unconditional flight
safety hazard exists
• Flight-Critical Functions—functions without which a potential short-term flight safety
hazard exists, conditioned upon factors such as environment or flight condition, that
the pilot can circumvent by accepting an operational penalty (flight diversion)
• Noncritical Functions—functions without which no short-term impact on flight safety
exists
The ARCS design requirements identified herein project an increasing reliance upon automa-
tion and active controls technology to improve the productivity of transport airplanes. The
increasing utilization of more flight-crucial functions with time, reflected by the three ARCS
application models defined below, is based on an expectation of continued evolution of
sensor and electronic hardware technologies, resulting in improved reliability and cost.
[Figure 4 not reproduced; caption illegible in source. Legible labels: yaw damper, STOL SAS, maneuver load alleviation, fly-by-wire, flutter suppression, hard SAS; "flight regime restricted after loss of function"; "flight impossible or impractical after loss of function."]
The general system interface architecture, shown in figure 5, shall be capable of handling
simplex, duplex, triplex, or quadruplex redundancy, although at the present time the required
redundancy level cannot be established for all anticipated applications with a high degree of
confidence. The basic ARCS architecture was developed for a triplex system configuration.
The projection of the functional scope of each of the ARCS application models is based on
assumptions relative to the technological development of sensor, airframe, and electronic
circuitry technologies at three time periods designated as near term, intermediate term, and
far term, as illustrated by figure 6. The near-term application is characterized by today's
existing sensors and commercial transport airframe technology. The far-term model repre-
sents airframe technology beginning to be implemented in production fighters today, which
implies historically that it is 10 to 20 years in the future for commercial transport applica-
tion. The intermediate term is a step in between, which takes advantage of new sensors
currently under development that simplify system interfaces and allow a more efficient
system organization through digital data communication.
The assumptions made for sensor and airframe technology advancement, and the functional
capabilities deemed to be desirable and justified for each application time period, are discussed
below. Typical control law configurations for the flight-critical functions and interface
requirements are given in appendix A.
Near-Term Application Model.-The near-term commercial transport being designed today
will be assumed to have a basically stable airframe; the only probable dispatch-required automatic
flight control function will be a yaw damper. This conservative assumption reflects
the Boeing design philosophy for an airplane that would be introduced in 1980. The primary
flight control system will be mechanical or direct electrical links not requiring sensor feedback
processing.
The sensors available for the near-term application will be those sensors available and proven
today. The traditional organization of automatic flight control separated from navigation/
guidance functions will prevail, and no integration of mode select panels for those functions
will be assumed. An area-navigation (R-NAV) computer will provide steering command via
serial digital bus to the flight control computers, and the air data computer will be digital.
Attitude and heading information from vertical/directional gyros (VG/DG) will be standard,
with the inertial navigation system (INS) being an available option for long-range aircraft.
Roll and pitch rate gyros will not be included; rates will be derived from attitude information.
The flight control mode-select panel will communicate with the computer via discretes for
flight-critical functions and via serial digital links for noncritical functions. The near-term
application model will utilize miniprocessors.
The functional complement of the near-term model will consist of the following capabilities:
Flight-critical functions:*   Category IIIb autoland
                              Low-speed yaw SAS
Noncritical functions:        Control wheel steering (CWS)
                              Altitude hold/select
                              Airspeed hold/select
                              Mach hold
                              Vertical speed hold/select
                              Heading hold/select
                              2-D/3-D R-NAV coupler
*High-speed yaw SAS will be a separate analog system. This will be the only dispatch-
required AFCS function.
Figure 5.-General System Interface Architecture
[Diagram not reproduced. Legible labels: sensor and mode control block, servo and display block, computer block, system test panel, ARCS application model.]
[Figure 6 not reproduced; content and caption illegible in source.]
Intermediate-Term Application Model.-The intermediate-term airplane will be assumed to have
taken half the step toward fly-by-wire; limited-authority series servo control augmentation
will be used in all three control axes, but an unaugmented reversionary mode (mechanical
or direct electrical link) will be available for operation within a restricted flight envelope.
This airplane could be a STOL transport or a CTOL transport with relaxed basic airframe
stability requirements; an operating automatic flight control system would therefore be a
dispatch requirement.
The major improvement in sensor technology for this application model will be the availability
of an integrated strapdown air data system (ISADS), providing all attitude, rate, acceleration,
and air-data-derived signals in a digital format. In addition, a digital low-range radio altimeter
with improved reliability is assumed to be available. Except for airplane-mounted force and
position sensors and instrument landing system (ILS) deviations, all sensor signals will be in
digital format. A digital display system using CRT displays will be assumed.
A completely integrated approach to the cockpit design and the system-to-pilot interface
has been assumed for the intermediate-term model. All noncritical flight control functions
are assumed to be performed by the navigation-guidance computers, as indicated by figure 7.
The functional separation shown in figure 7 represents the anticipated organization of a
future integrated navigation/guidance/flight control system for a commercial transport. The
functional grouping provides for flight-critical functions in a highly reliable, fault-tolerant
computer system with simple, redundant, electrically separated man/machine interfaces, and
noncritical functions in lower redundancy with nonredundant, more complex man/machine
interfaces. The configuration described by figure 7 has evolved as a result of the experience
in fault-tolerant flight control systems and integrated avionic systems research, development,
and testing conducted by Boeing during the past several years.
A microprocessor (LSI) implementation of the ARCS processor will be assumed possible for
the intermediate-term model. Reliability and failure effect considerations must determine
the optimal approach to hardware implementation.
The functional complement of the intermediate-term application model will be as follows:
Flight-critical functions:    STOL SAS
                              Full-time command/stability augmentation
                              Cat IIIb autoland
Noncritical functions:*       Track angle select/hold
                              Flightpath angle select/hold
                              Autothrust modes
                              2-D/3-D/4-D command generation
*Performed by nav-guidance computer
Figure 7.-Integrated Navigation/Guidance/Flight [remainder of caption illegible in source]
[Diagram not reproduced. Legible labels: flight critical, non-flight critical, command augmentation, autoland, system integrity assurance, servo command, servo engage, mode and parameter select panel, navigation, guidance, display generation, auto cruise, auto thrust, servos, fault tolerant ARCS computers, computers.]
Far-Term Application Model.-For the far-term application model, an Advanced Technology
Transport (ATT) concept was selected. This is a controls-configured vehicle/fly-by-wire
(CCV/FBW) airplane with one pitch control surface, two pairs of roll control surfaces, and
two pairs of structural mode/flutter control surfaces.
The only additional sensor assumed available for this application model is the microwave
landing system (MLS) receiver. Structural mode and flutter suppression require
accelerometers and rate sensors distributed in the structure of the vehicle. Except for
these structural mode sensor signals, the control force sensor signals, and the servo position
and differential pressure sensor signals, all signals will be either digital or discrete.
The total system organization for this application model will be similar to the configuration
shown in figure 7, with the flight-crucial functions included in the fault-tolerant computer.
It is expected that a basic redundancy level of quadruplex will be required for the CCV/FBW
application. Full use of microprocessor technology is anticipated in order to contain failure
effects and increase system functional survivability.
The functional complement of the far-term application model was assumed to be as follows:
Flight-crucial functions: Flutter suppression
Structural mode suppression
Fly-by-wire control
Full-time stability augmentation
Flight-critical functions: Category III MLS autoland
Noncritical functions: Same as for intermediate-term model,
plus air-ground data link for ATC
communication; performed by nav-
guidance computer
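The functional complements of the three application models can be collected into one lookup structure, which makes the growth in criticality from near term to far term easy to see. The sketch below is only a restatement of the lists given in this section; the names `Criticality`, `APPLICATION_MODELS`, and `most_critical_level` are hypothetical and do not come from the report.

```python
from enum import Enum


class Criticality(Enum):
    # Declared from most to least severe; the order is used below.
    FLIGHT_CRUCIAL = "flight-crucial"
    FLIGHT_CRITICAL = "flight-critical"
    NONCRITICAL = "noncritical"


_INTERMEDIATE_NONCRITICAL = [
    "Track angle select/hold",
    "Flightpath angle select/hold",
    "Autothrust modes",
    "2-D/3-D/4-D command generation",
]

APPLICATION_MODELS = {
    "near-term": {
        Criticality.FLIGHT_CRUCIAL: [],
        Criticality.FLIGHT_CRITICAL: ["Category IIIb autoland",
                                      "Low-speed yaw SAS"],
        Criticality.NONCRITICAL: ["Control wheel steering (CWS)",
                                  "Altitude hold/select",
                                  "Airspeed hold/select",
                                  "Mach hold",
                                  "Vertical speed hold/select",
                                  "Heading hold/select",
                                  "2-D/3-D R-NAV coupler"],
    },
    "intermediate-term": {
        Criticality.FLIGHT_CRUCIAL: [],
        Criticality.FLIGHT_CRITICAL: ["STOL SAS",
                                      "Full-time command/stability augmentation",
                                      "Cat IIIb autoland"],
        Criticality.NONCRITICAL: list(_INTERMEDIATE_NONCRITICAL),
    },
    "far-term": {
        Criticality.FLIGHT_CRUCIAL: ["Flutter suppression",
                                     "Structural mode suppression",
                                     "Fly-by-wire control",
                                     "Full-time stability augmentation"],
        Criticality.FLIGHT_CRITICAL: ["Category III MLS autoland"],
        Criticality.NONCRITICAL: _INTERMEDIATE_NONCRITICAL
        + ["Air-ground data link for ATC communication"],
    },
}


def most_critical_level(model_name):
    """Return the most severe criticality level with at least one function."""
    complement = APPLICATION_MODELS[model_name]
    for level in Criticality:  # iterates in declaration order
        if complement[level]:
            return level
    return None
```

Only the far-term model contains flight-crucial functions, which is consistent with the expectation stated above that it will need a quadruplex redundancy level.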
Aircraft Transient Fault Environment.-Transient phenomena emanating from the airplane
environment can temporarily disable one or more redundant elements in the flight control
system and must be considered in the design of the ARCS, whether for the near-, intermediate-,
or far-term applications. Four potential sources of such transients exist in a commercial
airplane: electrical power bus voltage variations, electromagnetic pulses generated by lightning
strokes, transient sensor signal disagreements, and hydraulic power system pressure transients.
Power transients on aircraft electrical power buses occur for two reasons. Normal transients
are caused by normal switching of loads in the course of operating the aircraft. Abnormal
transients are caused by failures of loads on a bus. Current power quality specifications for
new airplanes require that normal transients have a duration of less than 0.1 second and that
automatic fault clearing keeps abnormal transients to a duration of less than 10 seconds. Up
to 20 normal power transients per hour can be expected during flight operations.
Lightning strokes on Boeing aircraft have occurred on the average of once per 2500 hours.
The duration of the high-energy transient in the discharge is assumed to be no greater than
50 MS.
Transient sensor signal disagreements can occur for two reasons. Static gain differences in
combination with large aircraft maneuvers will cause sensor signal differences, even though
the sensor outputs are within specified tolerances. Tolerances in dynamic response characteristics
between like sensors, in combination with installation variances and turbulence, will
likewise result in disparities between nominally like signals.
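A short worked example shows why in-tolerance sensors can still disagree noticeably during a large maneuver. The sketch assumes two sensors whose static gains sit at opposite ends of a symmetric tolerance band; the function name and the numbers (a 30-degree attitude and a 2 percent gain tolerance) are hypothetical and chosen only to illustrate the effect.

```python
def worst_case_disagreement(true_value, gain_tolerance):
    """Largest difference between two in-tolerance sensors.

    true_value     -- the quantity both sensors are measuring
    gain_tolerance -- allowed fractional gain error (0.02 means +/-2 percent)

    One sensor reads true_value * (1 + tol), the other true_value * (1 - tol),
    so the difference grows in proportion to the size of the maneuver.
    """
    return abs(true_value) * 2.0 * gain_tolerance


# Hypothetical case: 30-degree attitude, +/-2 percent gain tolerance.
difference_deg = worst_case_disagreement(30.0, 0.02)  # about 1.2 degrees
```

A fixed comparison threshold tighter than this difference would flag a healthy pair of sensors as disagreeing, which is why such disagreements are treated as transients to be tolerated and not as failures.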
Transient pressure variations in the hydraulic power systems are caused by large load applica-
tions. Transient pressure and servo position differences between redundant flight control
elements are conceivable since individual hydraulic systems carry different complements of
loads.
The ARCS design shall be structured to remain operational for, and recover from, faults
induced by the transient phenomena described above.
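The transient rates quoted above can be turned into a rough per-flight exposure estimate. The sketch below uses the figures from this section (up to 20 normal power transients per hour, each shorter than 0.1 second, and one lightning stroke per 2500 hours on average). The flight duration and the assumption that lightning strokes arrive as a Poisson process are illustrative assumptions, not statements from the report.

```python
import math

NORMAL_TRANSIENTS_PER_HOUR = 20.0   # upper figure quoted in the text
NORMAL_TRANSIENT_MAX_S = 0.1        # maximum duration of a normal transient
LIGHTNING_MEAN_HOURS = 2500.0       # average hours between lightning strokes


def transient_exposure(flight_hours):
    """Return per-flight upper bounds and a lightning-stroke probability."""
    max_events = NORMAL_TRANSIENTS_PER_HOUR * flight_hours
    max_disturbed_s = max_events * NORMAL_TRANSIENT_MAX_S
    # Assumed Poisson arrivals: P(at least one stroke during the flight).
    p_lightning = 1.0 - math.exp(-flight_hours / LIGHTNING_MEAN_HOURS)
    return {
        "max_normal_power_transients": max_events,
        "max_disturbed_seconds": max_disturbed_s,
        "p_at_least_one_lightning_stroke": p_lightning,
    }


# Hypothetical 2-hour flight.
exposure = transient_exposure(2.0)
```

For the hypothetical 2-hour flight this gives up to 40 normal power transients and a lightning-stroke probability just under 0.0008, so power transients are routine events that the restart logic meets on every flight, while lightning is rare but cannot be ignored over a fleet lifetime.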
4.1.1.2 System Functional Readiness
Component failures will occur in the flight control system with an average frequency deter-
mined by the component failure rates. The probability that a given function will remain
operational, or be operational when called for, decreases with the exposure time since the
last functional verification.
For the ARCS program, the following definition for functional readiness was made:
Functional readiness is the probability of retaining a function at a certain time assuming the
system is initially fully operational.
The calculated functional readiness for an autoland function is illustrated by figure 8. The
lower curve represents a triplex fail-operational system composed of present-day sensor,
servo, and airplane system technology components and a triple-modular-redundant (TMR)
computer system with a failure rate equal to that of the baseline WWCS. The upper curve
illustrates the ultimate improvement in functional readiness that can be achieved if a system
with identical component failure rates can be designed to survive two like failures, i.e., can
degrade to simplex operation for any two like failures.
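The difference between the two curves can be reproduced qualitatively with a k-of-n reliability model. The sketch below assumes n identical, independent channels with a constant failure rate and perfect failure detection, and asks for the probability that at least k channels are still working. These simplifications, the function name, and the failure rate used in the example are assumptions; the report's own figures come from a detailed model of sensors, servos, and computers.

```python
import math
from math import comb


def k_of_n_readiness(k, n, failure_rate_per_hour, hours):
    """Probability that at least k of n channels still work after `hours`.

    Each channel survives with probability r = exp(-failure_rate * hours),
    independently of the others (constant failure rate assumption).
    """
    r = math.exp(-failure_rate_per_hour * hours)
    return sum(comb(n, i) * r**i * (1.0 - r)**(n - i)
               for i in range(k, n + 1))


# Hypothetical channel failure rate of 1e-3 per hour, 100 hours of exposure.
rate, exposure_hours = 1e-3, 100.0
needs_all_three = k_of_n_readiness(3, 3, rate, exposure_hours)
needs_two = k_of_n_readiness(2, 3, rate, exposure_hours)
needs_one = k_of_n_readiness(1, 3, rate, exposure_hours)
```

Which value of k applies to a given function depends on the operational criteria. The comparison simply shows that each additional like failure the system can survive moves the readiness curve upward for the same component failure rates.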
[Figure 8 not reproduced; caption illegible in source. The figure plots the calculated functional readiness of the autoland function.]
For the reliability calculations illustrated by figure 8, constant component failure rates
were assumed. This implies that elapsed time since last verification of fault-free perform-
ance can be substituted for elapsed time since maintenance. Fault-free performance could
be verified by preflight test (typical for FBW) or by the fault-free use of the function
on a previous flight (autoland).
In addition to functional readiness, the concept of function availability is useful for the
specification of ARCS criteria. It was defined as follows:
Function availability is the probability that a function is available for use at any point in time,
given a certain maintenance schedule.
Availability, with this definition, is the average functional readiness over the operating
time-span considered. For the ARCS, the operational timespan of interest is a series
of airplane flight missions from dispatch to gate arrival at the airplane's destination. This
is compatible with the ARINC definition of availability, where the times for active repair
of a system, logistics, and administration are ignored in the context of specifying the
reliability of a given functional capability.
To achieve acceptable functional readiness and availability, certain operational and
economic criteria must be established. Due to the different criticality of the two major
ARCS functions, functional readiness criteria take different forms for the FBW and the
autoland functions, as described below.
CCV/FBW Functional Readiness.—The very nature of the flight-crucial CCV/FBW function
makes it dispatch required; its functional availability at dispatch is therefore set at unity
by edict. However, to ensure an economical design, specific maintenance requirements
must be set forth to limit the cost of maintaining the system in its dispatch-required
condition.
There is a direct relationship between maintenance effort and frequency of line replace-
able unit (LRU) failures. Since every failure in the dispatch-critical function has to be
repaired before the next flight, a readiness curve for a FBW function, corresponding to
that shown in figure 8 for the autoland function, must start at time zero for every
flight where the functional readiness is unity. The only criterion available to assess
the acceptability of the FBW functional readiness is the maintenance effort required to
keep the FBW functional readiness at unity for every flight dispatch.
The first functional readiness requirement for the CCV/FBW function thus addresses
the cost of maintaining the system. Lacking specific requirements in this area, the
following position was taken for the ARCS:
The effort required to maintain the FBW function in dispatch-required status shall not
exceed the present level of maintenance for dispatch-required flight control functions.
The level of maintenance required for primary flight controls (ATA Chapter 27) on
airplanes operational today is approximately 8 man-hours (line and shop maintenance)
and $50 of material cost per 100 hours of flight time.
The second FBW functional readiness criterion addresses the situation when a first
failure state in the FBW system has been incurred. This condition, which can occur
anytime after the airplane is dispatched from the departure gate and which has a
reasonable probability of occurring in a complex function like the CCV/FBW, must not
impair the operational utility of the airplane. The following requirement was therefore
postulated:
No first (in-flight) failure of components involved in the FBW function shall require that
the flight plan be modified for any flight duration of less than 10 hours.
Ten hours was selected to be representative of the longest flight time presently antici-
pated for commercial transports.
In the event several components fail within the CCV/FBW function, a failure condition
could arise that requires the airplane to land within a short time so that safety-of-flight
requirements based on survival probability of the partly failed system are not violated.
Any such diversion constitutes a disruption of the airline operation; therefore, the third
FBW functional readiness criterion addresses the diversion probability. The position
was taken that diversion probability due to FBW failure must not be higher than the
current diversion probability due to primary flight controls (ATA Chapter 27):
Any failure state of the FBW function requiring diversion from the flight plan shall be
improbable.
Improbable shall be interpreted consistent with its use in the FAR's. Although not
defined quantitatively in any regulatory document, a probability lower than 1 x 10^-5
has been used by industry to signify an improbable event. The average diversion rate per
departure due to flight control system failure in the United Airlines fleet presently is
approximately 3 x 10^-5.
Low-Visibility Automatic Landing Functional Readiness.—The Category III autoland
function is not anticipated to become dispatch required in the foreseeable future. It is
conceivable, however, that given demonstrated operational benefits, an airline may
declare the function flight mandated, i.e., an airplane will not (normally) be dispatched
without repair if the autoland function is known to have incurred a failure state that
would disqualify it from being used on the next landing.
The lower curve of figure 8 also represents the predicted functional readiness of an existing
fail-operational triplex analog Category IIIa autoland system currently in airline service. The
actual functional readiness experienced in service with this system closely follows this curve.
This functional readiness history is considered unacceptable on operational and economic
grounds by at least two major international flag carriers. For long "thin" international
routes in particular, aircraft flight time on the order of 100 hours before return to the air-
line's primary maintenance base is not uncommon.
After reviewing these international airline operations and considering our experience with
current systems, it was postulated that a low-visibility autoland function, in order to be
operationally and economically justifiable, must have a function availability goal as follows:
The average probability of having an operational capability required to commit to a Category III
landing shall be no less than 0.9 for 100 hours between system maintenance.
The number 0.9 was chosen from airline comments on desired Cat III availability; no
recognized airline or FAA standard relates to Cat III availability.
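Because availability was defined as the average functional readiness over the operating time span, the goal above can be checked by averaging a readiness curve over 100 hours. The following is a minimal sketch under that reading; the exponential readiness curves and their rates are hypothetical stand-ins for a curve such as those of figure 8, and the function names are illustrative.

```python
import math
from typing import Callable


def average_readiness(readiness: Callable[[float], float], hours: float, steps: int = 1000) -> float:
    """Trapezoidal average of readiness(t) over the interval 0..hours."""
    dt = hours / steps
    total = 0.5 * (readiness(0.0) + readiness(hours))
    total += sum(readiness(i * dt) for i in range(1, steps))
    return total * dt / hours


def meets_availability_goal(readiness: Callable[[float], float],
                            hours: float = 100.0, goal: float = 0.9) -> bool:
    """True when the average readiness over the interval is no less than the goal."""
    return average_readiness(readiness, hours) >= goal


# Hypothetical readiness curves (assumed, not from the study).
good = lambda t: math.exp(-1.0e-3 * t)
poor = lambda t: math.exp(-5.0e-3 * t)
print(round(average_readiness(good, 100.0), 4), meets_availability_goal(good))
print(round(average_readiness(poor, 100.0), 4), meets_availability_goal(poor))
```

Averaging rather than sampling the end point matters: a curve may fall below 0.9 before 100 hours and still satisfy the goal if its average over the interval remains at or above 0.9.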
4.1.1.3 System Maintainability
Fault-tolerant redundant automatic flight control systems have been introduced into airline
operation to prevent hardware failures from creating hazardous situations during a critical
operation such as autoland. Redundancy is also being used in aircraft systems to enhance
basic system functional availability. While these systems have been developed to achieve
desired safety and functional reliability goals, many complex problems have surfaced
concerning testing and maintaining system operational integrity. These problems have
offset to an uncertain extent the economic benefit of introducing the higher functional
capability.
Fault tolerance requires that faults occurring within the system be detected, localized, and
isolated. The ability to localize faults therefore becomes an inherent requirement for a
system designed to tolerate multiple like faults. As a consequence, the system design must
include mechanisms to test for, and register the existence of, fault conditions as they occur
in operation. In addition, if the computer system is to be tested for the existence of latent
failures, provisions must be made for conducting such a test in an off-line or non-real-time
mode.
The maintenance problem for automatic flight control systems (AFCS) is more complex
than for most other avionic systems. To function properly, the AFCS requires other interfacing
aircraft systems such as sensors, servoactuators, electrical power generation/distribution
systems, and hydraulic power systems. A problem with any one of these interfacing systems
can potentially result in degradation in AFCS performance capability, thus requiring that
maintenance action be taken. Experience with contemporary equipment shows that to
maintain such complex, interdependent systems using manual testing and troubleshooting
techniques is a formidable, time-consuming airline maintenance task, and maintenance
effectiveness has fallen far short of desired goals.
The magnitude and criticality of the AFCS maintenance problem causes increasing concern
throughout the aircraft industry. Evidence of this concern is given by the recently formed
Airline Electronic Engineering Committee (AEEC) Subcommittee on Built-in Test Equipment
(BITE), whose major undertaking thus far has been to address the area of built-in test
considerations for future digital automatic flight control systems. Furthermore, with the
current economic pressure on the airline operators, it is anticipated that the need to improve
maintenance efficiency will persist.
The identified requirements—on-line fault detection for fault tolerance and improved main-
tenance effectiveness—led to recognizing the need for an improved, integrated system-level
self-test/diagnostic capability. An automated system test function shall satisfy these two
requirements of the ARCS.
System test, as applied to the ARCS, can be viewed as an automated built-in test and diagnostic
function designed to provide fault detection and LRU-level failure isolation capability for
the AFCS including sensors, computers, servoactuators, and interface elements. The ability
of system test to rapidly and comprehensively test the various facets of the ARCS provides
a high confidence in system operational integrity, which is an integral element of fault
tolerance. Furthermore, by automating the maintenance decision-making process, system
test will not only reduce system maintenance time requirements but also improve overall
maintenance effectiveness, thereby reducing system ownership cost.
The following subsections identify the criteria and requirements for the maintenance-
related system test function. These requirements, developed by United Airlines, were dis-
cussed and generally supported by members of the AEEC BITE subcommittee. The criteria
and requirements discussion has been subdivided into five categories: maintenance test
conditions and efficiency, operational constraints, test procedures, fault detection and
recording, and system test/crew interface.
Maintenance Test Conditions and Efficiency.—Cost effectiveness is the most important
parameter to consider in the development of maintenance enhancement methods. A primary
factor in assessing system cost effectiveness is the number of maintenance man-hours required
per flight-hour, where maintenance man-hours include both line and shop maintenance. This
consideration is the basis for many of the ARCS system test design requirements. For
example, any need for periodic or time-controlled maintenance places an unnecessary cost
burden on the maintenance operation, no matter how simple the tests may be. A 5-minute
test required on each one of United Airlines' 600 000 annual flights would add up to 50 000
maintenance man-hours. Because of this, airlines in general have adopted, or are converting
to, a maintenance program based on the "condition monitoring" philosophy. Under this
program, maintenance action is initiated only as a result of a failed preflight test, if applicable,
or an in-flight failure reported by the flightcrew. A necessary consequence of this philosophy
is that unless a preflight test is required or unless a squawk was generated on the previous
flights, the system is assumed to be operational and available for service. In keeping with this
philosophy, ARCS maintenance will be conducted on strictly a "condition monitoring"
basis.
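The man-hour figure quoted above for a recurring 5-minute test follows from simple arithmetic, reproduced here as a short sketch. The function name and the single-technician default are illustrative assumptions.

```python
def annual_test_man_hours(flights_per_year: int, test_minutes: float, technicians: int = 1) -> float:
    """Man-hours per year spent on a test repeated on every flight."""
    return flights_per_year * test_minutes * technicians / 60.0


# 600 000 flights x 5 minutes = 3 000 000 minutes = 50 000 man-hours
print(annual_test_man_hours(600_000, 5))
```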
An assessment of maintenance practices and experience on contemporary flight control
systems highlights one major expense item which, perhaps more than any other, is indicative
of the inadequacies of present maintenance aids and procedures: the "unverified" equipment
removals. A removal (normally incurred in response to a reported flight problem) becomes
an unverified one when either the same squawk occurs on subsequent flights or the removed
unit is found to be fault-free in the shop. Unverified removals are typically a result of so-
called "shotgun" maintenance techniques, wherein equipment is removed and replaced in
an effort to clear a flight squawk without adequate testing and troubleshooting. Typical
unverified removal rates for contemporary flight control system LRU's range from 45% to
70%. These excessive unconfirmed removal rates are costly not only because of the unneces-
sary line and shop maintenance action but because of the expense associated with the
additional pipeline spares.
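The definition of an unverified removal given above can be expressed as a small bookkeeping sketch. The record layout, field names, and sample data are hypothetical and serve only to show how a rate in the quoted 45% to 70% range would be computed from removal records.

```python
from dataclasses import dataclass


@dataclass
class Removal:
    lru_name: str
    squawk_recurred: bool   # same squawk reported on a subsequent flight
    shop_found_fault: bool  # shop test confirmed a fault in the removed unit


def is_unverified(removal: Removal) -> bool:
    """Unverified if the squawk came back or the shop found the unit fault-free."""
    return removal.squawk_recurred or not removal.shop_found_fault


def unverified_rate(removals: list[Removal]) -> float:
    """Fraction of removals that were unverified (0.0 for an empty list)."""
    if not removals:
        return 0.0
    return sum(is_unverified(r) for r in removals) / len(removals)


# Hypothetical removal history, for illustration only.
history = [
    Removal("UNIT A", squawk_recurred=False, shop_found_fault=True),
    Removal("UNIT B", squawk_recurred=True, shop_found_fault=True),
    Removal("UNIT C", squawk_recurred=False, shop_found_fault=False),
    Removal("UNIT D", squawk_recurred=False, shop_found_fault=True),
]
print(unverified_rate(history))
```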
Test effectiveness was defined as the ratio of faults detected to faults incurred. The effective-
ness goal for the ARCS system test function in detecting and localizing failures within the
computer subsystem has been placed at 99% for a redundant configuration and no less than
95% in a simplex configuration. The effectiveness goal of the system test function in detect-
ing and localizing failures within the redundant sensor and servo systems has been placed at
90%. These quantitative goals are based on experience with inertial navigation systems and
certain digital flight control systems used for military applications. Such goals are in keeping
with current airline maintenance operations as expressed at a recent BITE subcommittee
meeting and illustrated in figure 9, which shows that approximately 50% of the AFCS squawks
are satisfactorily repaired by routine line maintenance, 40% are resolved during an overnight
repair cycle, 8% require the intervention of specially trained technical specialists, and 2%
require engineering action. The system test function is expected to be effective between the
90% level achievable by line maintenance (turnaround or overnight work) and the 98% level
currently achieved by line maintenance plus technical specialist.
[Figure 9 levels: gate-level line maintenance (routine maintenance activity), 50%; overnight maintenance (requires overnight maintenance with test procedures and support equipment), 40%; technical specialists; engineering assistance]
Figure 9.—Maintenance Effectivity
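The 90% and 98% levels cited above are running totals of the squawk-resolution percentages quoted for figure 9. A brief sketch of that accumulation follows; the function name and tuple layout are illustrative.

```python
from itertools import accumulate

# Share of AFCS squawks resolved at each maintenance level, as quoted in the text.
LEVELS = [
    ("routine line maintenance", 50),
    ("overnight repair cycle", 40),
    ("technical specialists", 8),
    ("engineering action", 2),
]


def cumulative_resolution(levels):
    """Running total of squawks resolved as each maintenance level is added."""
    return list(accumulate(pct for _, pct in levels))


for (name, _), total in zip(LEVELS, cumulative_resolution(LEVELS)):
    print(f"{name:26s} cumulative {total}%")
```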
The interest of airline operators is not in the percentage of the circuitry covered by system
test but rather in the percentage of equipment removals that are confirmed. This considera-
tion imposes two requirements on the design: (1) consideration must be given to failure
rates when configuring a system test and (2) nuisance failure indications (indications when
no failure was incurred) must be minimized or, if possible, eliminated. Therefore, another
ARCS system test effectiveness goal is that the number of false failure indications be less
than 5% of the total number of failure indications.
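The quantitative goals stated in this subsection (test effectiveness as the ratio of faults detected to faults incurred, and false indications below 5% of all failure indications) can be collected into one check. The sketch below is illustrative only: the dictionary keys, function names, and the sample counts are assumptions, while the thresholds are those quoted in the text.

```python
# Effectiveness goals quoted in the text, keyed by hypothetical configuration labels.
EFFECTIVENESS_GOALS = {
    "computer_redundant": 0.99,
    "computer_simplex": 0.95,
    "sensor_servo": 0.90,
}
MAX_FALSE_INDICATION_FRACTION = 0.05


def effectiveness_ratio(faults_detected: int, faults_incurred: int) -> float:
    """Ratio of faults detected to faults incurred."""
    if faults_incurred <= 0:
        raise ValueError("faults_incurred must be positive")
    return faults_detected / faults_incurred


def false_indication_fraction(false_indications: int, total_indications: int) -> float:
    """Share of failure indications raised when no failure was incurred."""
    if total_indications <= 0:
        raise ValueError("total_indications must be positive")
    return false_indications / total_indications


def meets_goals(kind: str, detected: int, incurred: int,
                false_indications: int, total_indications: int) -> bool:
    """True when both the effectiveness goal and the false-indication goal are met."""
    return (effectiveness_ratio(detected, incurred) >= EFFECTIVENESS_GOALS[kind]
            and false_indication_fraction(false_indications, total_indications)
            < MAX_FALSE_INDICATION_FRACTION)


# Hypothetical counts, for illustration only.
print(meets_goals("computer_redundant", 199, 200, 4, 100))
print(meets_goals("sensor_servo", 85, 100, 4, 100))
```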
Maintenance actions cannot be considered complete until the installed system is checked to
verify that it is operating properly and the aircraft is again available for dispatch. Since
this area is of significant importance to the airlines, any automated AFCS maintenance aid
must address this aspect of the maintenance process. This consideration becomes of particu-
lar importance for an aircraft certified for operation under Category II or III conditions.
Following maintenance action, Federal Aviation Administration regulations require that a
functional verification test be made before the aircraft can be returned to Cat II or Cat III
status. This procedure has typically been a lengthy process, requiring two maintenance
technicians and a variety of ground support equipment. Airlines have explicitly expressed
a need for an automated system test to perform this verification function with a high level
of confidence, where confidence means a high correlation between test results and actual
functional integrity.
Operational Constraints.—Quick maintenance action is essential. In many cases, a line maintenance
station will not initiate maintenance action for nondispatch modes of an AFCS when
turnaround time is less than 4 hours. This current experience reflects excessive time required
to isolate faults, replace components, and verify system integrity. It should be possible to
conduct system-level maintenance at any line maintenance station that has the required spare
LRU's. United Airlines operational experience shows that 25 minutes is available during
airplane turnaround for testing and repairing the AFCS. In this time it must be possible to
identify the failed LRU, replace the faulted unit, and verify system operational integrity.
To achieve this goal, the airlines press for simplicity in the maintenance operation. Any
maintenance action must be based on simple, sound, clear-cut logic that the maintenance
crew can readily interpret and respond to. A ground maintenance program requiring the
use of ground support equipment (external test gear) generally has a lengthy setup and test
procedure that violates the above defined operational requirements. Therefore, the main-
tenance concept for the ARCS assumes the use of built-in-test equipment only, i.e., ground
tests for maintenance purposes, including verification testing, must be performed by line
maintenance personnel using only the equipment normally installed in the airplane. It must
be designed for operation by line maintenance personnel of a skill level equal to that of
today's typical maintenance technician. Furthermore, the ARCS system test concept for
routine line maintenance of the ARCS, i.e., maintenance action not involving hydraulic
power, is to require only one maintenance technician (two if the test involves hydraulic power).
Any requirement for routine preflight testing is strongly discouraged because of the time
and workload burden thus imposed. It is recognized, however, that such testing will be
mandatory for certain flight-critical functions. ARCS functions falling into this category
are command augmentation and fly-by-wire in the intermediate- and far-term applications,
respectively. For these functions, preflight testing will be restricted to only those elements
required for dispatch of the aircraft for a revenue departure. Repair of faulted dispatch-
required items must be made prior to release of the airplane, while repair of non-dispatch-
required items may be deferred. The conduct of a preflight test is the responsibility of the
flightcrew and must not require the support of line maintenance personnel.
Maintenance Test Procedures.—Figure 10 illustrates the maintenance cycle as seen by airline
maintenance operations. The original detection of a fault condition is normally in flight,
manifested by poor system performance and/or fault detecting/annunciation equipment in
the system. Ground maintenance action is then initiated as a result of a flight log writeup
made by the flightcrew indicating the symptoms of the fault. The flightcrew's responsibility
is to report the existence and observed nature of the anomaly; the line maintenance person-
nel must determine the location of the fault and effect necessary repairs. Before the airplane
can be returned to full operational status, the system operational integrity must be verified.
Verification testing is primarily to ensure that all required interfaces have been reestablished
so that no signal paths are interrupted.
[Figure 10 cycle: fully operational system; in-flight fault detection; ground check, fault isolation, and repair; system verification]
Figure 10.—Maintenance Cycle
In contemporary flight control systems, a major obstacle in a reasonable and systematic
system checkout is the complexity imposed when numerous specific preconditions, or pre-
test setups, are required for test conduct. Failure to properly set up all required preconditions
leads to erroneous test results and hence possible incorrect maintenance action. This
condition is believed to be a significant contributor to the unverified removal rate for flight
control and associated system LRU's. Though the setup of necessary preconditions is the
responsibility of the crew, the ARCS system test is expected to provide an automatic assess-
ment of the status of the required conditions and to alert the operator when they are not
properly satisfied.
Certain aspects of system test do not lend themselves to fully automatic implementation.
Where such conditions exist, operator intervention with the test process will be requested,
together with appropriate instructions via the system test panel (STP) display. Action
requests and instructions shall be in the form of an alphanumeric text-type display.
Ground testing shall be broken into three major modules for safe and efficient test sequencing:
(1) computer system tests, (2) sensor tests, and (3) servo tests. Transition from computer
system testing to sensor system testing shall be automatic, with no operator intervention
required. Transition to servo testing, however, must require the test operator to satisfy the
servo test preconditions, i.e., hydraulic power, and to ascertain that it is safe to exercise the
control surfaces. (This is the reason for having two technicians conducting tests involving
hydraulic power. One is required to be outside the aircraft to clear all active surfaces.)
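The sequencing rule for the three ground test modules can be sketched as a small state transition function. This is one possible reading of the requirement, not the ARCS implementation: the enumeration, the function name, and the two boolean preconditions are illustrative assumptions.

```python
from enum import Enum, auto


class Module(Enum):
    COMPUTER = auto()
    SENSOR = auto()
    SERVO = auto()
    DONE = auto()


def next_module(current: Module,
                hydraulic_power_on: bool = False,
                surfaces_clear_confirmed: bool = False) -> Module:
    """Return the module that follows `current` once its tests have completed."""
    if current is Module.COMPUTER:
        return Module.SENSOR  # automatic, no operator intervention
    if current is Module.SENSOR:
        # Servo testing waits until the operator has satisfied the preconditions.
        if hydraulic_power_on and surfaces_clear_confirmed:
            return Module.SERVO
        return Module.SENSOR
    return Module.DONE


print(next_module(Module.COMPUTER))
print(next_module(Module.SENSOR))
print(next_module(Module.SENSOR, hydraulic_power_on=True, surfaces_clear_confirmed=True))
```

Holding in the sensor state until both conditions are confirmed mirrors the reason given for the second technician: the control surfaces are not exercised until someone outside the aircraft has cleared them.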
Fault Detection and Recording.—Experience has shown that it is highly possible to have fault
conditions that manifest themselves only in flight, due primarily to the dynamic environment
which then exists. Since such fault conditions are difficult if not impossible to duplicate on
the ground, severe problems are created for the line maintenance operation. The problem is
worsened when the flightcrew writeup is unclear or imprecise. The system test design,
therefore, must include the capability to automatically detect and isolate failures, particularly
sensor failures, while in the dynamic flight environment. Such failure information must be
stored for later use by line maintenance personnel so that the need to duplicate the fault
condition on the ground is eliminated.
To achieve this capability, the ARCS system test function must provide pertinent failure
information relative to in-flight fault conditions that is accessible for display as a part of the
ground maintenance operation. This failure information must supply the line maintenance
technician with sufficient information to allow him to localize the fault condition to a specific
LRU. These considerations led to the establishment of an ARCS system test requirement
that a section of nonvolatile memory be provided for storage of this data. Though some
historical record of past failure history may be beneficial to engineering and/or shop main-
tenance, the only failure data that need be provided for line maintenance is that pertaining
to the immediately preceding flight segment.
In addition to in-flight fault detection/recording capability, a means must be provided where-
by a ground crew can initiate an automated system checkout sequence as part of the ground
maintenance operation. This capability will assist ground maintenance by locating permanent
faults within the system, by testing for potentially latent failure conditions not otherwise
detectable by on-line monitors, and by verifying system operational integrity following main-
tenance action.
When required, a preflight test procedure shall be provided that can be viewed as a subset of
the total ground test procedure. Failure storage memory shall be provided for recording the
results of this test procedure for display to the test operator.
All failure data storage, whether accumulated as a result of in-flight testing, operation moni-
toring, or ground testing, shall be recorded in a manner to explicitly identify the failure to
an LRU level. Airline operators have asked that the failed LRU be identified by name in
an alphanumeric format, wherever possible. In this way potential ambiguities due to misinter-
pretation can be eliminated.
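The recording requirements above (failures identified to the LRU by name, whatever their source, with only the immediately preceding flight segment needed by line maintenance) suggest a simple record structure. The sketch below is hypothetical: the class names, fields, LRU names, and display format are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    IN_FLIGHT_TEST = "in-flight test"
    OPERATION_MONITORING = "operation monitoring"
    GROUND_TEST = "ground test"


@dataclass(frozen=True)
class FailureRecord:
    lru_name: str        # failed LRU identified by name
    source: Source       # how the failure was detected
    flight_segment: int  # segment counter stored with the record


def records_for_line_maintenance(log: list[FailureRecord], last_segment: int) -> list[FailureRecord]:
    """Keep only records from the immediately preceding flight segment."""
    return [r for r in log if r.flight_segment == last_segment]


def display_line(record: FailureRecord) -> str:
    """Alphanumeric text naming the failed LRU."""
    return f"{record.lru_name} FAIL ({record.source.value})"


# Hypothetical stored log, for illustration only.
log = [
    FailureRecord("SENSOR UNIT 2", Source.IN_FLIGHT_TEST, flight_segment=41),
    FailureRecord("SERVO UNIT 1", Source.OPERATION_MONITORING, flight_segment=42),
]
for record in records_for_line_maintenance(log, last_segment=42):
    print(display_line(record))
```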
System Test/Crew Interface.—A means must be provided for interfacing the system test
function with the test operator, either the flightcrew or ground maintenance crew. This
interface shall be provided by a system test panel (STP) encompassing all that is necessary
for the operator to initiate the test function, control its progress, interact where required,
and display test results.
The system test panel is intended to be used by the flightcrew for preflight test and for
in-flight display of system operational status and by the ground maintenance crew for main-
tenance-level testing. Since the flightcrew requires access to the panel functions and ground
testing involves the use of certain flight deck controls, the STP must be designed for installa-
tion in the flight deck where it is accessible for both functions.
Insofar as possible, the STP shall be designed so that the ARCS appears to an operator as one
system. It shall be used to annunciate LRU-level failure information, system status or operational
capability, and operator action requirements in an alphanumeric format.
A distinctly different type of information is needed by the flightcrew compared to that
required by ground maintenance. The flightcrew need only know of the existence of a
failure condition if it affects the minimum operational capability (flight-critical mode) or the
currently selected operating mode. The maintenance crew, on the other hand, must know of
the existence of the malfunction and its location.
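The distinction between the two audiences can be sketched as two views of the same failure list. The mode names, class, and function names below are hypothetical; only the filtering rule (flight-critical mode or currently selected mode for the flightcrew, everything with its location for maintenance) is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    lru_name: str              # location of the malfunction
    affected_modes: frozenset  # operating modes degraded by this failure


def crew_must_be_told(failure: Failure, flight_critical_mode: str, selected_mode: str) -> bool:
    """Flightcrew view: only failures touching the critical or selected mode."""
    return (flight_critical_mode in failure.affected_modes
            or selected_mode in failure.affected_modes)


def maintenance_view(failures: list[Failure]) -> list[str]:
    """Maintenance view: every malfunction, reported with its location."""
    return [f.lru_name for f in failures]


# Hypothetical failures and mode names, for illustration only.
failures = [
    Failure("UNIT A", frozenset({"AUTOLAND"})),
    Failure("UNIT B", frozenset({"FBW"})),
]
for f in failures:
    print(f.lru_name, crew_must_be_told(f, flight_critical_mode="FBW", selected_mode="CRUISE"))
print(maintenance_view(failures))
```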
4.1.2 FLIGHT SAFETY CONSIDERATIONS
Specifying safety-of-flight criteria and goals implies defining the level of accident risk, or
probability of accident, that can be tolerated for the total aircraft and the risk contribution
that can be tolerated because of particular subsystems in the aircraft.
The safety-of-flight criteria and goals, or functional reliability criteria and goals, defined for
the ARCS were developed from three considerations. Two of these, and the basis for the
criteria, are the U.S. and British airworthiness regulations or proposed regulations. Where
interpretations of these regulations were necessary for a specific application, or when addi-
tional criteria were needed to completely specify an operational risk situation, an independent
assessment of the involved factors was made. Since the criteria and goals thus identified will
be of no use unless systems can be economically implemented and operated in a manner to
meet these criteria, the third consideration was projected implementation realism, which was
integrated into the criteria definition process.
The paths followed in defining safety-of-flight criteria for the CCV/FBW function and the
autoland function are illustrated by figure 11. Starting points are the U.S. Federal Aviation
Regulation (FAR) requirements and certain proposed British Civil Airworthiness Regulation
(BCAR) requirements pertaining to flight control systems. The result of the definition
process is a set of suggested criteria addressing the average functional reliability required for
the CCV/FBW and autoland functions and the specific risk associated with the CCV/FBW
function on a long flight (10 hours). In the process, current certification practice in the
definition of extremely improbable events, the definition of hazardous events, and the
actual current jet landing accident rate are discussed.
4.1.2.1 CCV/FBW Functional Reliability
The general regulatory safety requirement that governs the design of flight control systems
and other equipment, systems, and installations in commercial transports is stated in FAR
Part 25, paragraph 25.1309(b), dated August 5, 1970.
The airplane systems and associated components, considered separately and in relation to other
systems, must be designed so that-
1) The occurrence of any failure condition which would prevent the continued safe flight and
landing of the airplane is extremely improbable, and
2) The occurrence of any other failure conditions which would result in injury to the occupants,
or reduce the capability of the airplane or the ability of the crew to cope with adverse operat-
ing conditions is improbable.
Failure conditions shall include considerations of electrical and hydraulic power loss, including
loss of one engine on two- or three-engine airplanes, and two engines on four- or more-engine
airplanes.
Although not expressed in regulatory documents, a number less than or equal to 1 x 10^-9 has
been imposed upon manufacturers by the FAA in recent aircraft certification programs to
represent the probability of an event designated as extremely improbable. An improbable
event by FAA definition is one expected to occur with a probability between 1 x 10^-5 and
1 x 10^-9.
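The two probability bands described above can be written as a simple classifier. This is a sketch of the working numbers quoted in the text, not a regulatory definition; the treatment of the upper boundary as inclusive, the label for events above 1 x 10^-5, and the function name are assumptions.

```python
EXTREMELY_IMPROBABLE_MAX = 1.0e-9  # "less than or equal to 1 x 10^-9"
IMPROBABLE_MAX = 1.0e-5            # upper edge of the improbable band (assumed inclusive)


def classify(probability: float) -> str:
    """Map an event probability to the band used in the text."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability <= EXTREMELY_IMPROBABLE_MAX:
        return "extremely improbable"
    if probability <= IMPROBABLE_MAX:
        return "improbable"
    return "not improbable"


for p in (1.0e-10, 1.0e-7, 3.0e-5):
    print(p, classify(p))
```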
Insight into how to interpret and apply the "extremely improbable" concept is furnished by
the following excerpt from reference 1, a paper presented by an FAA official:
There have been many careless and inaccurate references to a probability of "ten to the minus
nine," giving a vague and misleading impression that any failure condition which is to be accounted
for at all must be shown to be extremely improbable: that is, shown by analysis that the proba-
bility of occurrence of such a failure condition is less than 10^-9.
This is only true if the effect of the particular failure condition being considered would preclude
any possibility of continued safe flight and landing of the aircraft: or which, in a word, is catas-
trophic.
[Figure 11 elements: FAR concept: extremely improbable event; BCAR concept: average risk/specific risk; actual jet landing accident rate; current certification practice; hazardous event; CCV/FBW average functional reliability; CCV/FBW specific risk on 10-hr flight; autoland functional reliability]
Figure 11.—ARCS Functional Reliability Criteria Definition Process
The significance of this statement is that if it can be shown that the loss of the
CCV/FBW function is extremely improbable, the functional reliability requirement is
satisfied. In other words, it is not required that all catastrophic failure conditions
(loss of FBW, wing spar failure, etc.) taken together be extremely improbable.
With the background presented above, and taking a Boeing conservative position that
all CCV/FBW function failures are catastrophic, the formulation of the first CCV/FBW
criterion becomes straightforward:
Loss of the CCV/FBW function, given a fault-free system at dispatch, shall be extremely
improbable.
This requirement addresses the average risk (fatal accident probability) over all flights
in all aircraft using the system considered. The average risk is the combined effect
of the flight duration and the system failure probability, which increases with exposure
time as illustrated by figure 12.
For the fly-by-wire application, a mission shall be assumed to start at the time of
dispatch and end at the time the aircraft leaves the runway after landing at the destina-
tion or at an alternate airport. Mission times of up to 10 hours shall be considered,
and an average flight time of 1 hour per flight shall be assumed (based on United
Airlines data).
In addition to the average risk, it is desirable to specify a maximum ceiling on the
specific risk associated with a long flight. A proposed BCAR (ref. 2) suggests that
the specific risk associated with an autoland be no higher than the risk of another
flight, i.e., the average actual in-service accident probability for commercial transports.
Adapting this concept to the cruise situation would yield (again postulating a maximum
flight time of 10 hours): the specific risk from all causes on any flight of up to 10
hours shall be lower than the total current accident risk.
The total accident data worldwide during 1968-1973 shows approximately four fatal
accidents per 1 million flights. The average flight time was slightly over 1 hour. This
data was extracted from NTSB published accident reports.
Allowing a 5% budget for FBW-related accident causes—the proportion of causes
attributed to flight control systems according to the 1968-1973 accident data—a
second requirement for the CCV/FBW function was formulated:
The specific risk contributed by the CCV/FBW function on any flight of up to 10 hours' duration
shall be less than 0.20 x 10^-6.
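The ceiling in this requirement follows from the two figures quoted above: roughly four fatal accidents per million flights and a 5% budget for flight-control causes. The sketch below reproduces that arithmetic. The implied hourly rate is an added illustration that assumes a constant failure rate and the small-probability approximation P ≈ rate x time; the function names are hypothetical.

```python
# Figures quoted in the text (1968-1973 worldwide accident data).
TOTAL_FATAL_ACCIDENT_RISK = 4e-6   # about four fatal accidents per million flights
FBW_BUDGET_FRACTION = 0.05         # share attributed to flight control systems
MAX_FLIGHT_HOURS = 10.0            # longest mission considered


def fbw_specific_risk_limit(total_risk=TOTAL_FATAL_ACCIDENT_RISK,
                            fraction=FBW_BUDGET_FRACTION):
    """Per-flight specific-risk ceiling for the CCV/FBW function."""
    return total_risk * fraction


def implied_hourly_rate(limit, hours=MAX_FLIGHT_HOURS):
    """Constant hourly failure rate that just meets the limit.

    Assumption: P ~= rate * hours, valid while the probability is very small.
    """
    return limit / hours


limit = fbw_specific_risk_limit()
print(f"specific-risk limit per flight: {limit:.2e}")
print(f"implied rate over a 10-hour flight: {implied_hourly_rate(limit):.2e} per hour")
```

Under these assumptions the 10-hour flight, not the 1-hour average flight, sets the allowable failure rate.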
In addition to the average risk and specific risk, which apply to the normal operation of the
CCV/FBW function, the particular operational circumstances and reliability requirements
associated with a functional failure state leading to the necessity of a diversion from the
intended flight plan must be specified.
The pilot must be notified if a failure state occurs that significantly reduces the CCV/FBW
functional survival probability below normal levels. This failure annunciation must occur
such that the overall functional reliability requirement is not violated. The total CCV/FBW
failure probability P(FBW fail) can be expressed as a conditional probability
P(FBW fail) = P(FBW fail | diversion) • P(diversion)
where P(diversion) is the probability of diversion. The third CCV/FBW safety-of-flight
requirement then becomes:
The system shall annunciate when a failure state requiring diversion has been incurred so that,
including an additional 30 minutes of flight, the combined event of incurring a diversion and a
system failure after incurring the diversion is extremely improbable.
The 30-minute time period following a diversion decision was specified for the U.S. SST.
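The third requirement can be written symbolically as follows. This is only one possible reading: it assumes that "extremely improbable" is taken as 10^-9, the value the report adopts for the autoland discussion, and the symbol T_d for the 30-minute period is introduced here for illustration.

```latex
P(\text{diversion}) \cdot P(\text{FBW fail} \mid \text{diversion},\ t \le T_d) \le 10^{-9},
\qquad T_d = 0.5\ \text{h}
```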
4.1.2.2 Autoland Functional Reliability
Although automatic landing is not the exclusive method allowed by regulations to achieve
Category III weather capability, it is the primary one pursued at present. This indicates the
difficulties associated with low visibility and the human pilot's ability to cope with restricted
visual cues during landing. Use of autoland in good visibility—including Category II weather
conditions and better— is strongly supported by segments of the airline industry to improve
overall safety.
Specific regulatory criteria for Category IIIa landing systems are expressed by FAA Advisory
Circular (AC) 120-28A, Section 3, dated December 14, 1971:
Operational Concepts-The total airborne system must be designed and must provide sufficient
information to the pilot so that the landing may be safely continued and completed or a go-around
safely executed from any altitude following any single failure or combinations of failures not
shown to be extremely improbable.
A crucial issue in the probabilistic approach is the definition of the event to which the
probability is attached. For the ARCS program, the conservative assumption was made that
the pilot will not intervene and recover from a system failure that occurs below the "alert"
height in Category III conditions. The alert height is aircraft dependent, but is typically
between 50 and 100 feet above the runway.
When the autoland function is used during conditions that provide runway visibility above
Category III minima, the pilot will be credited with at least a 0.99 probability of recovering
from any autoland system failure below decision height and either safely completing the
landing or executing a go-around. This assumption has previously been used in a certification
context.
A system failure during autoland is catastrophic only under two circumstances: (1) the
visibility is such that the crew cannot recover from the failure effect or (2) the crew fails to
recover despite favorable visibility conditions. More strictly formulated, the hazardous
events in connection with autoland can be expressed as:
1. Autoland failure given Category III conditions
2. Autoland failure given pilot failure to recover in non-Category III conditions
The hazardous exposure lasts for approximately 45 seconds (from decision height to stop
on the runway).
AC 120-28A requires that each case of hazardous events be extremely improbable. Accepting
for a moment that extremely improbable implies a probability of occurrence equal to or
lower than 10^-9, items 1 and 2 can be written:
3. P(autoland failure) • P(Cat III) < 10^-9
4. P(autoland failure) • P(pilot fail) < 10^-9
The probability of incurring Category III weather in the U.S. is approximately 0.5%. Assum-
ing that this probability worldwide is less than 1 % and recalling the assumption that the
pilot will recover from at least 99% of all autoland malfunctions in non-Category III condi-
tions, items 3 and 4 yield the same result:
P(autoland failure) < 1 x 10^-7
Therefore, for the autoland function, a failure probability lower than 1 x 10^-7 for a 45-second
exposure time must be achieved to meet the FAA requirement.
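The derivation of the autoland limit from items 3 and 4 can be checked numerically. The sketch below uses only the probabilities stated in the text. The conversion to an hourly rate is an added illustration that assumes a constant failure rate over the 45-second exposure; the function name is hypothetical.

```python
EXTREMELY_IMPROBABLE = 1e-9   # value the text adopts for "extremely improbable"
P_CAT_III = 0.01              # assumed worldwide upper bound on Category III weather
P_PILOT_FAIL = 0.01           # pilot credited with at least 0.99 recovery probability
EXPOSURE_SECONDS = 45.0       # decision height to stop on the runway


def autoland_failure_limit(conditioning_probability, target=EXTREMELY_IMPROBABLE):
    """Largest P(autoland failure) that keeps the joint event below the target."""
    return target / conditioning_probability


limit_weather = autoland_failure_limit(P_CAT_III)    # item 3
limit_pilot = autoland_failure_limit(P_PILOT_FAIL)   # item 4
limit = min(limit_weather, limit_pilot)

# Equivalent constant failure rate over the exposure (assumption: P ~= rate * t).
rate_per_hour = limit / (EXPOSURE_SECONDS / 3600.0)

print(f"P(autoland failure) limit: {limit:.1e}")
print(f"equivalent failure rate: {rate_per_hour:.1e} per hour")
```

Both conditioning probabilities are 1%, which is why the two items lead to the same limit.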
The corresponding BCAR requirement for autoland average risk is formulated as follows:
The system shall be such that the total fatal landing accident rate (i.e., average risk) due to the
use of the system at any time and in the new visibility conditions permitted below current minima
(approx. 200 ft and 1/2 mile) shall not be greater than the present total fatal landing accident
rate for all transport aircraft. This figure is believed to be of the order of one fatal accident
per million landings. Since piloting is only one of several possible causes of fatal landing
accidents, the system should not contribute a rate greater than 1.0 x 10^-7 fatal accident per
landing.
The actual total fatal landing accident rate for worldwide jet transport operations during
1968-1973 was approximately 2 x 10^-6.
The BCAR concept deals with the use of the system, which implies two potential causes for
a hazard: risk due to system malfunction and risk due to performance. If 10% of the
landing risk from all causes is allocated to piloting and this 10% is split equally between
autoland malfunction and performance, a resulting required probability of malfunction
lower than 1 x 10^-7 is obtained.
Thus, the functional reliability requirement established to satisfy the FAR criterion equally
satisfies the BCAR criterion.
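The allocation described above can be written out as a single product, using the actual 1968-1973 fatal landing accident rate as the starting point:

```latex
\underbrace{2 \times 10^{-6}}_{\text{total fatal landing accident rate}}
\times \underbrace{0.10}_{\text{share allocated to piloting}}
\times \underbrace{0.5}_{\text{share allocated to malfunction}}
= 1 \times 10^{-7}
```

This matches the limit obtained from the FAA criterion for the 45-second exposure.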
4.2 ARCS DESIGN REQUIREMENTS
The operational requirements identified in the previous sections need to be translated, inter-
preted, or expanded to take a form in which they can serve as design requirements for the
ARCS. The most significant aspect to recognize in this translation process is that the computer
system is an integral part of a total flight control system; it must be designed and optimized
in that context.
The ARCS development takes off from the state of digital flight controls technology
achieved to date. The GE MCP-703 (WWCS) system, which is the baseline system for the
ARCS, represents the third generation of flightworthy experimental digital flight control
computers, developed and tested extensively. The ARCS design requirements formulated
below have evolved from this development experience. They purport to define the most
desirable direction in which to pursue a fault-tolerant computer system for a critical flight
control application on a transport aircraft. The three ARCS design requirements—system
functional design, software design, and hardware design—are shown in figure 13 and are
discussed further in the following sections.
4.2.1 SYSTEM FUNCTIONAL DESIGN
The basic design objective for the ARCS was to develop a highly fault-tolerant, redundant
channel system for application in an airborne environment. The fault tolerance includes
the ability to withstand transient fault conditions by temporarily degrading to a lower
redundancy level, with recovery of redundancy when the transient fault condition disappears,
and the ability to remain operational after two like component failures by degrading from
triplex to duplex to simplex.
[Figure 13 block diagram: Section 4.2, ARCS Design Requirements, branches into Section 4.2.1,
Functional Design (Fault Tolerance Operation and System Test Operation); Section 4.2.2,
Software Design; and Section 4.2.3, Hardware Design.]
Figure 13.—Elements of ARCS Design Requirements
In the past, fault-tolerant analog automatic flight control systems have been designed to survive
one (triple-channel fail-operational) or two like (quad-channel two-fail-operational) failures. A
required effort included in the design task was to prove that any conceivable like component
failure beyond the first or the second failure, respectively, would result in a safe system shutdown.
Because of the nature of the digital system, with timesharing of hardware and multilevels of
dependency between elements of the system, the relatively simple analysis used for redundant
analog systems is no longer realistic. Instead, a probabilistic assessment approach must be
taken to design integrity into the system and to analyze the resultant design. This probabilistic
approach must be reflected in the formulation of design requirements.
A significant distinction exists between the traditional and the present design problem: The
purpose of the ARCS design is not to achieve strictly two-fail-operational fault tolerance (in
a triplex configuration), but to achieve a low probability of system failure. In other words, a
system failure after only two like failures is acceptable provided the probability of that
occurring is sufficiently low.
The discussion of ARCS requirements rests on a set of key terms, defined in section 4.2.1.1.
The design requirements for the fault-tolerant operation are covered in section 4.2.1.2.
System test, a crucial aspect with respect to ensuring that the basis for the probabilistically
derived conclusions on system reliability and safety indeed exists at crucial points during
system operation, is treated in section 4.2.1.3.
4.2.1.1 ARCS Key Definitions
The total flight control system comprises modules, or sets of elements, performing a
specified function. The system is organized vertically into stages, or sets of redundant
modules, and horizontally into channels. A channel is a unique set of modules, which
together are capable of performing the system function.
The ARCS includes modules such as the processor, memory, input/output, power supply,
and system test panel. Modules interfacing with the ARCS are sensors, mode select panel,
servos, and displays.
A fault is defined as a performance anomaly. A transient fault is a temporary performance
anomaly, while a failure is a permanent fault.
Tolerance is the ability of the system (or stage) to continue to perform the required function
given a fault. Coverage is the conditional probability that, given a failure, the system (or
stage) continues to perform the required function.
Reconfiguration is the process of attempting to tolerate a fault. This process has two
possible outcomes: recovery or redundancy degradation. Recovery is the reestablishment of
the operational level of redundancy that existed prior to the occurrence of a transient fault.
Redundancy degradation is the reduction of the operational redundancy level of a stage.
A fault whose presence can be detected by implemented self-testing and/or monitoring
procedures is called a detectable fault. The discovery of the existence of a fault is termed
detection. A latent failure is a failure that has not been detected.
Localization is the identification of the particular module in which a detected fault has
occurred, and isolation is the setting apart of a faulty module in such a way that it cannot
affect the system output.
4.2.1.2 Fault-Tolerant Operation
The basic design objective for the ARCS computer system is to minimize hardware
redundancy yet maximize fault tolerance in the event of (1) computer system, sensor
system, and servo system component failures and transient faults and (2) other transient
interruptions that influence the system operation in the airplane operational environment.
The first major assumption made for the ARCS definition study is that one basic organiza-
tion, or architecture, for the electronic flight control system can be developed to effectively
handle the full spectrum of presently envisioned functions. Based on existing experience in
fault-tolerant flight control system design, it is assumed that a successful candidate ARCS
will exchange sensor information between channels to achieve maximum system reliability
and that the system will utilize cross-strapped computer information for redundancy manage-
ment and reconfiguration purposes. The following ground rules and requirements anticipate
such a system.
The most disastrous failure mode in an ARCS would be one in which channel dependence
becomes the source of a single-point failure that causes system failure. It is therefore of
critical importance to guarantee channel independence in the probabilistic sense of
"independent events." The first three requirements below address this particular concern.
1. Channel self-dependence—System and channel status must be registered in each channel
in order to have the potential for achieving fail-operation from dual to single and
reconfiguration back to higher orders of redundancy.
Requirement: Each computer shall independently assess its own operational status.
2. Channel integrity— No single-point hardware failure and no software routine in one
computer shall have the potential to interrupt the operation of another computer by
bypassing the other computer's decision-making routines.
Requirement: No computer operation, or combination of computer operations,
shall interrupt the normal operation of another computer.
3. Fault propagation—Failures shall not have a potential to propagate across channels.
Requirement: No servo shall be controlled by processes outside its own channel.
High failure coverages are essential to achieve the system reliability demanded by the opera-
tional requirement. A first-failure coverage of unity is deemed a realistic design goal. The
following items address the fault tolerance of the system.
4. Output voting—Based on considerable experience in the technology of mechanical servo
output voting, the following rule was adopted.
Ground Rule: A mechanical servo output voting node, providing a system coverage
of unity for any first-failure condition in a triplex configuration,
shall be assumed for the ARCS development.
5. Second-failure coverage—Computer self-test and wraparound test have the potential
of providing a thorough fault-detection capability.
Requirement: A second-failure module coverage of 0.95 or better is a design
goal for the computer and interface modules.
6. Simplex-failure coverage—Coverage of a failure occurring in simplex operation will be
of importance in decreasing the probability of an active system failure.
Requirement: A simplex failure-detection probability of 0.90 is a design goal for
the computer and interface modules.
7. Sensor signal selection and failure detection—The sensor selection algorithm must be
able to isolate the effects of all types of faults, including open failures, hardover
failures, and any ramp failures and oscillatory failures.
Requirement: The signal selection algorithm must provide an output that is
acceptable to the application task for normal (unfailed) operation
and during any fault condition in one of the redundant input
signals.
Requirement: Means must exist to detect and isolate the effect of a sensor
failure, as well as to revise the failure-detection algorithm to be
compatible with the lower redundancy, before the probability of
incurring an additional like failure may be high enough to create
an unacceptable risk.
Requirement: The failure-detection algorithm must operate in the presence of
normal signal tolerances such as biases, scale factors, and linearity
errors, and in the presence of noise.
Requirement: The proportion of "nuisance failures" (leaky transients) due to
signal tolerance and noise must be insignificant compared to the
number of genuine failures.
8. Automatic start and restart—The system must be designed with a capability to recover
from massive transient faults, which could be caused by phenomena such as lightning
strokes and which can result in simultaneous disagreement between all voted or
monitored signals.
Requirement: The system shall be designed with the capability of automatic
start and synchronization after power turn-on, as well as automatic
restart and resynchronization after transient power faults or
massive transient signal faults.
9. Pilot intervention—Transient fault conditions are a major source of system degradation
in today's redundant systems. Tolerance against transient fault effects is a major design
requirement for the ARCS.
Requirement: Pilot intervention shall not be relied upon for system reconfiguration
processes.
Requirement: System start, restart, and synchronization shall not require pilot
intervention.
10. Allowable recovery time—The time allowed for reconfiguration from duplex to simplex
before declaring system failure is a significant parameter for the fault-tolerant computer
system. This time is dependent on the particular vehicle application. For the ARCS
functional success analysis, a time period longer than 1 second between fault occurrence
and successful system reconfiguration was selected as the criterion of system failure.
It is judged that vehicle control can be recovered if a critical function is restored within
this time period.
Requirement: The system must return to successful reconfiguration within 1
second after fault occurrence.
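The coverage goals in items 4 and 5 can be combined into a rough estimate of the probability of losing the function during a mission. The sketch below is a minimal four-state Markov model (triplex, duplex, simplex, failed) integrated with simple Euler steps. It is an illustration only, not the report's functional success analysis: it assumes a single constant per-channel failure rate (a hypothetical input), ignores transient faults, recovery, and the 1-second reconfiguration limit, and treats any failure in simplex as loss of function, so the simplex detection goal of item 6 does not enter.

```python
def redundancy_state_probabilities(rate, hours, c1=1.0, c2=0.95, steps=10_000):
    """Euler integration of a four-state Markov chain.

    States: triplex, duplex, simplex, failed. `rate` is the per-channel
    failure rate per hour (hypothetical input); c1 and c2 are the first- and
    second-failure coverages from items 4 and 5.
    """
    p3, p2, p1, pf = 1.0, 0.0, 0.0, 0.0
    dt = hours / steps
    for _ in range(steps):
        out3 = 3 * rate * p3 * dt   # any one of three channels fails
        out2 = 2 * rate * p2 * dt   # any one of two channels fails
        out1 = rate * p1 * dt       # the last channel fails
        p3 -= out3
        p2 += c1 * out3 - out2      # covered first failure: triplex to duplex
        p1 += c2 * out2 - out1      # covered second failure: duplex to simplex
        pf += (1 - c1) * out3 + (1 - c2) * out2 + out1
    return {"triplex": p3, "duplex": p2, "simplex": p1, "failed": pf}


# Example with a hypothetical channel failure rate of 1e-4 per hour.
probs = redundancy_state_probabilities(rate=1e-4, hours=10.0)
for state, value in probs.items():
    print(f"{state:8s} {value:.3e}")
```

In a model of this kind the uncovered second failure, the (1 - c2) term, tends to dominate the result, which illustrates why the second-failure coverage goal matters more than strict two-fail-operational behavior.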
4.2.1.3 System Test Operation
System test encompasses functions necessary to ensure the integrity of the fault-tolerant
system. Reliability predictions, on which the operational risks during the use of the system
are based, rest on assumptions about the failure status of the system at any particular time.
These assumptions must be validated continuously through a thorough verification that all
failures in the system are known as soon as possible after they occur.
It follows from the close relationship between finding a failure and effecting its repair that
system test for the purpose of maintenance action is inherently part of the same problem
complex. System test is therefore considered as one function with two purposes: integrity
assurance and maintainability enhancement.
The ARCS system test function shall encompass ground test, in-flight test, and in-flight
monitoring with the following specific purposes:
• The ground test function shall be organized as three distinct phases: (1) a self-test of
the computer subsystem, (2) a test of the sensor systems, and (3) a servo subsystems
test. Sensors and servos shall be tested both statically and dynamically wherever possible.
Any preflight testing of the ARCS shall be a subset of the ground test function.
• In-flight tests shall be conducted where necessary to cover subsystem failures that may
not be detectable by existing system monitors.
• In-flight monitoring shall be used whenever similar data are available from separate
independent sources to detect and isolate failures. When a sensor stage that is less
than triply redundant is monitored, the technique of "pseudo-sensor monitoring" shall
be used to augment the normal cross-channel comparison. Pseudo-sensor monitoring
refers to the practice of deriving sensor data from a related but not identical source.
The system test function shall be self-contained in each channel so that it is operable regard-
less of the number of operational channels. Control of the function and display of results
shall be the same whether the system is triplex, duplex, or simplex. The control and display
panel (CDP) shall interface with the computer in each channel.
Manual initiation of the test function shall be prohibited in flight in order to prevent an
inadvertent activation of a test that would disrupt the normal system operation. The only
manual input to be acknowledged in flight would be a read request that would display the
current operational or failure status of the system.
The ARCS system test function shall include the total flight control system in the following
testing sequence:
Computer system test
Input and output test
Sensor tests
Servo tests
Computer system test—As part of this test, cross-channel data links and independent
computer monitors shall be tested. A processor self-test program shall verify proper
operation of each computer instruction by testing each microstate. The operation of
the processor clock shall be continually monitored. All data being read out of variable
memory shall be tested for correct parity. The occupied portion of the variable memory
shall be sequentially read out to test for invalid parity. The contents of each location
in the nonvariable (program) memory shall be added together and compared with a
prestored value to assess the status of the memory.
Input/output tests and sensor tests—Wraparound loop tests shall be used to verify the
multiplexed portions of analog and discrete input/output signal paths. Sensor self-test
functions shall be exercised as a part of the ground tests to verify basic sensor availa-
bility and interface integrity.
Servo tests—Ground testing of the servo systems shall verify the engagement/disengage-
ment function, test the dynamic response, and verify the force voting characteristics.
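The two memory checks described under the computer system test can be sketched as follows. The 16-bit word length, the even-parity convention, and the (word, stored parity) layout are assumptions made for illustration; the text specifies only that program memory contents are summed and compared with a prestored value and that variable memory is checked for correct parity.

```python
WORD_MASK = 0xFFFF  # assumption: 16-bit memory words


def program_checksum(words):
    """Sum all program-memory words, wrapping at the word length."""
    total = 0
    for word in words:
        total = (total + word) & WORD_MASK
    return total


def program_memory_ok(words, prestored_sum):
    """Compare the computed sum with the value stored when the program was loaded."""
    return program_checksum(words) == prestored_sum


def parity_bit(word):
    """Even-parity bit: 1 when the word holds an odd number of one bits."""
    return bin(word & WORD_MASK).count("1") % 2


def variable_memory_faults(cells):
    """Return addresses whose stored parity bit disagrees with the data.

    `cells` is a list of (word, stored_parity) pairs, a hypothetical layout.
    """
    return [addr for addr, (word, stored) in enumerate(cells)
            if parity_bit(word) != stored]


rom = [0x1234, 0xABCD, 0xFFFF]
reference = program_checksum(rom)
print(program_memory_ok(rom, reference))                        # unmodified image
print(program_memory_ok([0x1235, 0xABCD, 0xFFFF], reference))   # one word altered
```

A simple sum of this kind does not catch every multiple-word error, which is one reason a coverage below unity is assumed for self-test.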
4.2.2 SOFTWARE DESIGN
The operational software is an element of the fault-tolerant computer system that is common
to all the redundant channels. A software error will not cause a disagreement between
redundant channels, will therefore not be detectable by the system, and thus constitutes a
potential single cause of system failure.
In principle, there is no difference between hardware design/mechanization errors and soft-
ware design/implementation errors. The only possible way to eliminate such errors is by
careful design, clearly defined mechanization or implementation standards, and testing.
Reliable (correct) software is an obvious requirement for a fault-tolerant computer system.
Reliable software requires a strict software design methodology. The software design
methodology adopted for the ARCS design process must recognize that the single most
important factor in achieving reliable software is to maintain design and implementation
visibility on all levels throughout the software development process. This visibility is
necessary to ensure that all requirements are being correctly interpreted and met and to
provide continuity across interfaces between personnel involved in the development, testing,
certification, and operational maintenance of the system. The relationships between these
processes are illustrated in figure 14.
The first interface area—software design—is illustrated in figure 15. The software designer
will develop the software requirements in cooperation with the control system engineer,
who is responsible for the specification of the overall system and its performance. After the
software design is complete, the designer must present his design in a format that will serve
three purposes: it will be the input to the programmer for implementation into code, it will
be an input to test planning for the verification and validation process, and it will provide
feedback to the control system designer.
The second area of interface, illustrated in figure 16, concerns the certification and operational
processes. The engineering representatives of the certificating authority will interface with
the system and software designers during the airworthiness validation phase of a new system's
introduction on a commercial transport. Finally, the operator's Engineering department
will need to know the design to a certain level in order to maintain the equipment.
The software design methodology used to describe the ARCS design must provide the
visibility needed for the above purposes. It must result in unambiguous, rigid software
definitions to ensure high confidence in the integrity of the fault-tolerant system and to
provide the design visibility necessary for certification of flight-safety-critical functions
implemented in, and described by, software.
4.2.3 HARDWARE DESIGN
In addition to satisfying the environmental design requirements specified for airborne avionics
in reference 3, the following specific ARCS or flight-controls-oriented requirements must be
met.
[Figure 15 diagram; surviving labels: software design document; independent verification and
validation engineer.]
Figure 15.—Software Design Interfaces
[Figure 16 diagram; surviving labels: system and software designers; system design description,
documentation, and test reports; operator's Engineering; regulatory agency Engineering.]
Figure 16.—Certification and Operation Interfaces
4.2.3.1 General Architecture
The final detail configuration of flight control functions can be established only through
flight testing. For an efficient flight test program requiring a minimum turnaround time
for software changes and the potential for making in-flight software changes, the computer
system must be capable of working with an electrically alterable memory such as a core
memory.
For the fully certified production-type system, on the other hand, a completely nonvolatile
program memory is anticipated to achieve the highest possible integrity and fault tolerance
in the system. Therefore, the processor-to-memory interface for instructions and constants
storage shall facilitate both core and ROM types of memory.
The computer system shall be applicable to flight control systems performing functions of
various mixtures of flight criticality and mission lengths, resulting in different requirements
for redundancy. The ARCS hardware shall therefore be adaptable to at least quadruplex
redundancy without requiring any changes to the architectural design.
An initial memory size estimation of 16K has been identified to implement the software
system for the ARCS far-term application model. To provide a sufficient growth margin,
the memory shall be expandable to a minimum of 32K.
The word length required for flight control computations is 16 bits. For strap-down inertial
navigation, 24-bit computations are required for certain calculations. Because of the poten-
tial of applying the ARCS computer to strap-down computation, a basic word length of 16
bits with the capability of efficient double precision, or a basic word length of 24 bits, is
required.
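As an illustration of what double precision on a 16-bit machine involves, the sketch below adds two 32-bit values held as pairs of 16-bit words and propagates the carry between them. The unsigned representation and the (high, low) pairing are assumptions; the report does not describe the arithmetic at this level.

```python
MASK16 = 0xFFFF


def add_double(a_hi, a_lo, b_hi, b_lo):
    """Add two 32-bit unsigned values held as (high, low) pairs of 16-bit words."""
    lo = a_lo + b_lo
    carry = lo >> 16                       # carry out of the low word
    hi = (a_hi + b_hi + carry) & MASK16    # high word wraps at 16 bits
    return hi, lo & MASK16


hi, lo = add_double(0x0001, 0xFFFF, 0x0000, 0x0001)
print(f"{hi:04X} {lo:04X}")
```

Each double-precision addition costs at least two single-word additions plus carry handling, which is why the requirement asks for this to be efficient.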
4.2.3.2 Processing Speed
The far-term application model for the ARCS requires an estimated 425 000 operations
per second (425 kops). To provide a sufficient margin for development and growth, a
computational capability of 600 kops shall be the goal for the far-term application model.
The processor timing shall allow minor cycle times down to 10 ms to satisfy the structural
mode control requirements of the far-term application model. As a preliminary requirement
for transient fault recovery, a cross-channel data transfer rate of 50 words/ms shall be
anticipated.
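The speed figures above imply a few derived quantities that are useful when budgeting processor time. The sketch below computes them from the stated numbers only. The whole-memory transfer at the end is a hypothetical worst case, added to relate the cross-channel transfer rate to the 1-second recovery limit and the 16K memory estimate.

```python
REQUIRED_OPS_PER_SEC = 425_000   # far-term application model estimate
GOAL_OPS_PER_SEC = 600_000       # design goal
MINOR_CYCLE_MS = 10              # shortest minor cycle
LINK_WORDS_PER_MS = 50           # preliminary cross-channel transfer rate

growth_margin = GOAL_OPS_PER_SEC / REQUIRED_OPS_PER_SEC - 1.0
ops_per_minor_cycle = GOAL_OPS_PER_SEC * MINOR_CYCLE_MS // 1000
words_per_minor_cycle = LINK_WORDS_PER_MS * MINOR_CYCLE_MS


def transfer_time_ms(words, rate=LINK_WORDS_PER_MS):
    """Time in milliseconds to move `words` across the cross-channel link."""
    return words / rate


print(f"growth margin: {growth_margin:.0%}")
print(f"operations per 10-ms minor cycle: {ops_per_minor_cycle}")
print(f"cross-channel words per minor cycle: {words_per_minor_cycle}")
# Hypothetical worst case: copying a full 16K-word memory image.
print(f"16K-word transfer: {transfer_time_ms(16 * 1024):.1f} ms")
```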
4.2.3.3 Design Integrity
The ARCS shall be designed to withstand and suppress the effects of electrical hazards
including lightning stroke. The ARCS hardware design shall be constructed to survive a cable
failure as a single failure, where all the wires in a cable may be open-end or shorted to
ground. It shall also survive, as a single failure, the computer LRU sliding out of the receiving
tray, where all pins on the LRU connectors disengage almost simultaneously.
5.0 ARCS DESIGN CONCEPT
The ARCS was designed from criteria and requirements spelled out in section 4. It is con-
venient to describe the resulting design from three distinct aspects: functional, software, and
hardware. This yields a top-down definition of the system concept as indicated by figure 17.
The functional description is the top level—a direct expansion of the design requirements
into functional concepts. From these functional concepts (sec. 5.1), the design is described
in progressively increasing detail as it is implemented in software and hardware. The
software design (sec. 5.2) was pursued to a stage from which coding can be initiated, whereas
the hardware design (sec. 5.3) stops short of the final detail design definition required for
hardware fabrication.
5.1 FUNCTIONAL DESCRIPTION
On the highest functional level, the ARCS performs four distinct tasks. The first and fore-
most, since it is the ultimate justification for the system, is the application task of controlling
the aircraft. The design of aircraft control functions is not, however, the object of the ARCS
development, although performance of those functions was a consideration integrated into
all ARCS design decisions.
The second task of ARCS, and the primary forcing function for the fault-tolerant design,
is that of providing maximum functional survivability in the presence of transient and
permanent fault conditions. Inherent in the fault tolerance is also the third task, namely
to ensure system integrity with respect to flight safety. Discussion of those elements that
comprise the fault-tolerant aspects of the ARCS design makes up the bulk of the functional
description.
The fourth task of the ARCS is that of fault status verification. Because of the maintenance
aspects and their significance to airline operations, a system test function was developed as
an integral element in the ARCS design. However, certain aspects of fault monitoring and
system failure status are inherently a part of the system integrity complex. System test is
therefore, by necessity, integrated with the fault-tolerant design aspects of the ARCS.
5.1.1 SYSTEM RECONFIGURATION
Reconfiguration has been defined as the process of attempting to tolerate a fault. In dealing
with the various fault conditions that can cause a transition between possible states in the
reconfigurable system, the process of power-up was considered as one of the reconfiguration
processes. In so doing, a simple, straightforward reconfiguration strategy was adopted to
handle all such transitions. In this strategy, power-up of a second channel is identical to a
recovery from a transient fault in duplex operation, and power-up of a third is equivalent
to a recovery in triplex.
The general reconfiguration process for the total system involves one or more of the sub-
processes of fault detection, fault localization, fault isolation, recovery, and redundancy
degradation. Figure 18 shows the possible outcomes as a result of faults occurring in the
system. Transient faults will result in restoration of the operational state that prevailed
before the fault occurred. Failure of the recovery process, i.e., the fault is declared perma-
nent, will result in a degradation in the redundancy state of the affected stage. A detected,
localized, and isolated permanent fault in triplex and duplex, an undetected permanent fault
in triplex, or an unrecovered transient fault condition will all result in redundancy degrada-
tion. A detected permanent fault in duplex could result in either a redundancy degradation
or a system failure depending on whether the fault is localized and isolated. A permanent
fault in simplex always means system failure whether it is detected or not.
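The outcomes listed in this paragraph can be collected into a small transition function. The sketch below is one reading of the text, with hypothetical names. Two cases are not stated in the paragraph and are marked as assumptions in the comments: a detected but unlocalized permanent fault in triplex, and an undetected permanent fault in duplex.

```python
from enum import IntEnum


class Level(IntEnum):
    FAILED = 0
    SIMPLEX = 1
    DUPLEX = 2
    TRIPLEX = 3


def after_transient(level, recovered):
    """Transient fault: recovery restores the prior state; otherwise the fault
    is declared permanent and the stage degrades by one level."""
    return level if recovered else Level(level - 1)


def after_permanent(level, detected, localized_and_isolated):
    """Outcome of a permanent fault at the given redundancy level."""
    if level == Level.SIMPLEX:
        return Level.FAILED      # system failure whether detected or not
    if level == Level.TRIPLEX:
        # Detected and isolated, or undetected: degradation to duplex.
        # Assumption: detected but unlocalized is treated the same way.
        return Level.DUPLEX
    if level == Level.DUPLEX:
        if detected and localized_and_isolated:
            return Level.SIMPLEX
        # Detected but not localized and isolated: system failure.
        # Assumption: an undetected fault in duplex is treated the same way.
        return Level.FAILED
    raise ValueError("no transitions out of the failed state")


print(after_permanent(Level.DUPLEX, detected=True, localized_and_isolated=False).name)
```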
Crucial to the reconfiguration process is fault detection. Five first-level monitoring functions
in the ARCS will immediately, upon activation, initiate action leading to reconfiguration.
Three of these first-level monitors—the computed command output monitor, the sensor
signal selection/failure detection (SSFD), and the servo monitor—provide selective fault
detection (see fig. 19). The watchdog monitor protects against gross faults that render the
logic capability of the processor unusable. Whereas these four monitors indicate the
occurrence as well as the disappearance of faults, the power monitor's primary function is
only to indicate that a particular fault condition—loss of power—has ceased, so that recovery
can be initiated.
Figure 20 shows in more detail the functional relationships between the first-level monitoring
functions and other primary processes involved in the ARCS reconfiguration. The SSFD and
output monitor are algorithms implemented entirely in software, whereas the servo monitor
is an independent hardware monitor augmented by software functions. The watchdog moni-
tor, which has a primary mission of monitoring the proper execution of software functions,
is an entirely autonomous, fail-safe hardware function. All these primary monitors inter-
face directly with reconfiguration process control algorithms—collectively called redundancy
management for the purpose of illustration. The SSFD, output monitor, and in part the
servo monitor use cross-channel data for comparison in triplex and duplex modes of
operation.
In addition to these first-level monitoring functions, hardware monitors, such as a RAM
parity error detector and a CPU arithmetic error detector, and a continuously operating
self-test function provide second-level monitoring. Information from these monitoring
points will be assembled in a fault assessment table. In situations where the first-level
monitors indicate a fault but do not provide sufficient information to make a logic decision
to initiate reconfiguration, the redundancy management process will refer to the fault assess-
ment table to seek information for fault localization. This will be the standard procedure in
duplex operation for an output monitor trip.
Four strategies of reconfiguration are involved in the ARCS. Each is associated with a type
of fault or operational circumstance (such as power-on) that forces the reconfiguration
process. Power disruption affects the system on a per-channel basis. Faults manifested as
sensor signal anomalies are contained by the voting node of the SSFD. Faults within a
computer cause output monitor trips per individual output commands, massive output
monitor trips, or a watchdog monitor trip. Faults within the servo loop are handled on an
individual basis by the associated processor or by the associated independent servo monitor.
Two characteristics of the general ARCS configuration governed the choice of strategy for
power-on/power-fault recovery. First, the power system of the aircraft is organized on a
per-channel basis, with individual circuit breakers for sensors, computers, and servos. The
strategy therefore was to accommodate switching on power in any sequence, without any
constraints as to intervals between the different switching actions. Second, to minimize the
number of line replaceable units and thus cost, the ARCS configuration uses dedicated sensor
interfaces per channel with digital, multiplexed cross-strapping of sensor data to facilitate a
one-box-per-channel design for the computer. This, however, introduces a sensor dependency
on the computer.
The ARCS power-on/power-fault recovery strategy is illustrated in table 1. Each computer's
redundancy management keeps a permanent failure flag raised for each of the channels with
which it has not achieved frame synchronization. A single operating channel (channel A in
table 1) will have B and C permanent failure flags set to mode all its redundancy management
processes into single-channel operation. Upon establishing synchronization with a second
channel (channel B in table 1), the B permanent failure flag will be removed, the B do-
not-use flag will be set, and channel B's processing will be initialized using state variables
transferred from channel A.
To prevent the recovery of one computer from interfering with the normal operation of
the other computer(s), all state variable data required for recovery is transmitted cross
channel every frame. Among the variable data transferred from A to B are do-not-use flags
from channel A's SSFD, initially causing both channels to operate on the A sensor data only.
As soon as channel B sensor data comes within the SSFD recovery thresholds with A sensor
data, the SSFD processing in each channel will mode into duplex operation on a per-sensor
basis. This produces the sequence of operation illustrated in table 1. At any time during this process,
the logic states in both channels' redundancy management will be identical. Applying
power to the third channel will cause the equivalent sequence of events, leading to triplex
operation. Loss of synchronization in triplex or duplex will cause the permanent failure
flag to be set, thereby causing all redundancy management processes to revert to duplex or
simplex modes, respectively.
The output monitor is a software process that compares the computed outputs of a processor
with the computed, cross-channel transferred outputs of the other processors. If an output
comparison does not agree within the monitoring threshold, an output monitor flag is set
for the affected output.
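The comparison performed by the output monitor can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the threshold values are assumptions, not taken from the ARCS design.

```python
def output_monitor(local, remote, threshold):
    """Compare local outputs with cross-channel outputs.

    local:     {output_name: value} computed by this processor.
    remote:    {channel_id: {output_name: value}} received cross-channel.
    threshold: {output_name: allowed absolute difference} (hypothetical).

    Returns {output_name: set of channels that disagree}; an entry in the
    result corresponds to an output monitor flag for that output.
    """
    flags = {}
    for name, value in local.items():
        for channel, outputs in remote.items():
            if abs(value - outputs[name]) > threshold[name]:
                flags.setdefault(name, set()).add(channel)
    return flags
```

Recording which channels disagree, rather than a single flag, is a design choice of this sketch; it keeps the information a triplex channel needs for a two-out-of-three decision.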
The reconfiguration strategy following an output monitor trip is outlined in figure 21. The
same general scheme applies to both the faulted and unfaulted computer(s) in triplex and in
duplex. The process shown in figure 21 represents one pass through the program that will
be repeated each frame as long as an output monitor disagreement exists. The following
describes the process.
OUTPUT MONITOR TRIP
(2 OF 3 IN 3X)
COULD THIS CHANNEL BE FAULTED?
3 IN 3X
2 IN 2X
CONTINUE PROCESSING: EXAMINE FAULT TABLE
ANY LOCAL FAULT REGISTERED?
YES
INHIBIT ENGAGEMENT OF AFFECTED SERVOS: RECOVER DATA FROM UNFAULTED CHANNEL
NO
FAULT PERSISTING?
IS PERMANENT FAULT DECLARED?
SET PERMANENT ISOLATION FLAG
Figure 21.—Strategy Following Output Monitor Trip
First, a computer decides if it could be the cause of the monitor disagreement. The only
time it can immediately conclude that it is not the faulted channel is in triplex when a two-
out-of-three decision can be made. Concluding that it can be faulted, a processor will check
its fault status table for flags that can be associated with the faulting output. If such a fault
is registered, the servo management process disengages the affected servo, and recovery is
attempted by utilizing variable data from an unfaulted computer to initialize processing
during the next iteration.
If the fault persists, as indicated by a continued trip of an output monitor, the described
process is repeated for a predetermined number of iterations after which the fault is declared
permanent. When this has occurred, a permanent failure flag is raised declaring that the
affected output is permanently disabled and the affected servo will be permanently disengaged.
If the fault indication disappears before the permanent fault decision occurs, the affected
servo will be reengaged.
If a processor's fault status table has not registered a fault that can be related to the output
monitor's trip, or if the processor is one of the two unfaulted processors in triplex, the
routine to check for a persisting fault and subsequent setting of a permanent isolation flag
will still be performed. This strategy provides each operating processor with an independent,
permanent failure assessment of the total system.
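One pass of this per-frame strategy, for a single output, might be sketched as below. The class name, the `limit` parameter (the "predetermined number of iterations"), and the argument names are hypothetical. Recovery of variable data from an unfaulted computer is not modeled.

```python
from dataclasses import dataclass


@dataclass
class OutputFaultTracker:
    limit: int                  # iterations before the fault is declared permanent
    count: int = 0              # consecutive frames with the monitor tripped
    permanent: bool = False     # permanent failure flag for this output
    servo_engaged: bool = True  # engagement state of the affected local servo

    def frame(self, tripped, could_be_local, local_fault_registered):
        """Run one pass of the strategy for the current frame."""
        if self.permanent:
            return  # a permanent declaration is never withdrawn
        if not tripped:
            # Fault indication disappeared before the permanent decision:
            # the affected servo is reengaged.
            self.count = 0
            self.servo_engaged = True
            return
        if could_be_local and local_fault_registered:
            self.servo_engaged = False  # disengage the affected servo
        # The persistence check runs in every channel, faulted or not.
        self.count += 1
        if self.count >= self.limit:
            self.permanent = True
```

Because the persistence count runs in every channel, each processor in this sketch reaches its own permanent-failure decision, in line with the independent assessment described above.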
The result derived from the computer output monitor is an indication of a computer fault
and the channel in which the fault occurred. This information is stored in the system failure
status table where it can be used by the redundancy management and system test functions.
Failure data accumulated in this table is used by system test to record the identity of the
failed element for later use by ground maintenance.
The third functional module for which performance monitoring is provided within the ARCS
architecture is the servoactuation system. Servo monitoring is provided on two levels—a
software monitor of the three coil sum currents per actuator when in a triplex configuration
and an independent hardware monitor for each actuation channel. Coil sum currents are
proportional to the pressure differential across the actuator and are therefore indicative of
the error that exists between that actuator and the force-summed output. By performing a
two-out-of-three vote of the three coil sum currents, failure coverage is provided for those
failure conditions not otherwise detectable by the independent hardware monitor or the
computer output monitor. A failure thus detected will result in a computer-commanded
shutdown of the associated actuator.
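A two-out-of-three vote of the coil sum currents could take the following form. This is a sketch under stated assumptions: the function name and the single comparison threshold are hypothetical, and the report does not give the actual voting rule in this section.

```python
def vote_coil_currents(currents, threshold):
    """Return the channel whose coil sum current is outvoted, or None.

    currents:  {"A": i_a, "B": i_b, "C": i_c}, one value per actuation channel.
    threshold: largest difference still counted as agreement (hypothetical).

    A channel is outvoted when the other two agree with each other and it
    disagrees with both of them.
    """
    names = sorted(currents)
    for odd in names:
        a, b = [n for n in names if n != odd]
        pair_agrees = abs(currents[a] - currents[b]) <= threshold
        odd_differs = (abs(currents[odd] - currents[a]) > threshold
                       and abs(currents[odd] - currents[b]) > threshold)
        if pair_agrees and odd_differs:
            return odd  # candidate for computer-commanded shutdown
    return None
```

When all three currents disagree with one another, the sketch returns `None`, since no two-out-of-three majority exists.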
When a failure condition has been detected by the independent servo monitor, the only
indication available to the system software is via the servo engagement discrete. Once a
channel disengagement has been effected, the cross-channel monitor of coil sum currents is
no longer of use since the bypassed servo will passively track the two good channels. Failure
information generated as a result of servo monitoring includes the results of the coil sum
current comparison and the detected disengagement status. These pieces of failure data are
stored in the system status table for use by the redundancy management and system test
functions as indicators of the failure status of the servoactuation system. Since there is a
software servo monitor provided for each axis of actuation and an independent hardware
monitor for each actuator, fault localization can be effected to the individual actuation
channel.
Cross-channel monitoring concepts of the baseline ARCS are based on frame-synchronous
operation between channels. The synchronization concept uses a software routine in
conjunction with a cross-channel discrete to establish and maintain frame-synchronous
processing in all three computers. A synchronization indicator (discrete) is generated by
each computer and transmitted to each of the other computers at the beginning of each frame
cycle. The sync indicator has the following significance:
• When SET, it informs the other channels that the local channel is initiating the synchroni-
zation process.
• When CLEARED, it causes the frame timing reference counter to reset and begin count-
ing out the next iteration period.
• If not SET and CLEARED within a time interval tolerance of the frame real-time period,
the local watchdog monitor will trip, indicating an irregularity in the computer's real-
time operation.
The local sync routine accumulates failure information about the other computers by testing
whether their sync indicators are set and cleared at the appropriate times. This information
is passed to the local redundancy management process, which will use the sync information,
as well as data from other sources, to arrive at the overall system failure status. The sync
routine is processed once every minor frame, assessing the other channels' operations each
frame regardless of their previous failure status.
The watchdog monitor is an independent, fail-safe monitor of the real-time operation of
each computer. If the measured time interval between the clearing of consecutive sync
indicators is not within specified upper and lower limits of the nominal frame time period,
the watchdog monitor will indicate a computer fault condition. If the computer's sync
indications return to a periodic interval, the watchdog monitor will clear the fault indication.
A cleared state of the local watchdog monitor is required for the independent servo engage
logic to respond to engage command discretes from the computer. When, as a result of a
fault, the watchdog monitor trips, all servos in the local channel will be disengaged. If the
fault clears and the watchdog monitor is subsequently reset, the computer must reengage all
the local servos.
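The interval check performed by the watchdog monitor can be illustrated in software, although in the ARCS it is an independent hardware function. The class below is a sketch only: the names, the time units, and the choice to clear the fault after a single in-limits interval are assumptions.

```python
class Watchdog:
    """Model of the interval check between consecutive sync-indicator clears."""

    def __init__(self, nominal, tolerance):
        self.low = nominal - tolerance    # lower limit of the frame period
        self.high = nominal + tolerance   # upper limit of the frame period
        self.last_clear = None
        self.tripped = False

    def sync_cleared(self, now):
        """Record a clearing of the sync indicator at time `now`."""
        if self.last_clear is not None:
            interval = now - self.last_clear
            # Assumption: one in-limits interval is enough to clear the fault.
            self.tripped = not (self.low <= interval <= self.high)
        self.last_clear = now
        return self.tripped

    def poll(self, now):
        """Trip if no clear has arrived within the upper limit."""
        if self.last_clear is not None and now - self.last_clear > self.high:
            self.tripped = True
        return self.tripped

    def servo_engage_permitted(self):
        """Engage commands are honored only while the watchdog is clear."""
        return not self.tripped
```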
The relationship between the minor frame iteration timing reference, the sync indicator setting
and clearing, and the watchdog monitor function is shown in figure 22. A timer interrupt
occurs when the frame iteration timing reference reaches the end of a frame time count
initiating the sync routine. The local sync indicator is subsequently set and, while the local
indicator is high, the local channel is testing the other channels' sync indicators to determine
if they are ready to sync. When all indicators are set, all three channels clear their local sync
indicators essentially simultaneously. This resets the iteration timing reference in each
channel and all three computers are thus synchronized. The processors then independently
execute their in-line program until the timing reference again reaches the end of the minor
frame time count.
Should the processor fail to set or clear its sync indicator, the watchdog monitor will trip
and the minor frame iteration timing reference will not be reset. The iteration timing circuit
continues to count and will cause a local interrupt every time it overflows. If the computer
is able to respond to these interrupts, they will be interpreted as recovery interrupts by
virtue of the fact that the watchdog monitor has tripped. The redundancy management
process will then attempt a recovery operation.
In the ARCS recovery process there is no difference between resynchronization initiated by
the redundancy management function and initial sync as a result of a power-on interrupt.
If the local channel determines that another computer is operating normally, it waits for the
beginning of the next frame's computation, as signaled by a foreign sync indicator going from
clear to set, and enters the normal sync routine. It then enters the operational software at
the appropriate minor frame as part of the recovery operation.
During an initial startup, or during a recovery attempt, the local channel must first determine
if any other computers are operating. It does so by monitoring the sync indicators of the
other two channels for a period of time longer than one frame, as illustrated in figure 22. If
the local channel detects no activity by the other computers, it goes into a simplex mode of
operation. The initial sync process proceeds through the following steps:
• Mask timer interrupt.
• Set local sync indicator.
• Test if other sync indicators are set.
• Wait for 1.1 frame times while testing other sync indicators to determine if there is any
activity.
• Clear local sync indicator.
• If local computer determines that it is the only one operating, it should:
— Enable timer interrupts.
— Start processing.
Otherwise it should:
— Wait and sync with an already operating channel.
— Enable timer interrupts.
— Copy state variable data.
— Start processing.
The normal synchronization process repeated in each frame includes the following steps:
• Check sync indicators for clear upon entry to routine.
• Set local sync indicator.
• Wait, alternately testing the two other sync indicators, until one of the other computers
is ready.
• If neither is ready before the time limit is exceeded, interpret this as loss of sync.
• After one of the other channels is detected, wait for last channel.
• If the time limit is exceeded for the third channel, mark it failed and continue.
• Clear local sync indicator.
• Test sync indicators to be sure that none failed in a set condition.
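The decision logic of the normal synchronization steps can be sketched as a pure function of when the other channels' sync indicators were observed set. The function name and arguments are hypothetical, and the sketch assumes a single time limit measured from the moment the local indicator is set; the report does not state how the limit for the last channel is measured.

```python
def normal_sync(ready_times, time_limit, stuck_set=()):
    """Evaluate one frame's synchronization attempt.

    ready_times: {channel: time at which its sync indicator was seen set,
                  measured from the local set, or None if never seen}.
    time_limit:  longest wait allowed (assumed common to both channels).
    stuck_set:   channels whose indicator is still set after the local clear.

    Returns (in_sync, failed_channels).
    """
    ready = {ch for ch, t in ready_times.items()
             if t is not None and t <= time_limit}
    if not ready:
        # Neither of the other channels became ready: loss of sync.
        return False, set(ready_times)
    # Channels that missed the time limit are marked failed; processing continues.
    failed = set(ready_times) - ready
    # Final check for indicators that failed in a set condition.
    failed |= set(stuck_set)
    return True, failed
```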
Execution of the synchronization function in software simplifies the hardware by eliminating
the necessity of interconnecting the three iteration timing references and voting on them in
each channel.
In this section we have identified primary elements of the ARCS redundancy management
processes and defined some of them on a functional level. In the following section we will
organize all the processes to be performed by the ARCS in a systematic manner so that the
visibility relative to requirements is maintained down through all levels of design.
5.1.2 FUNCTIONAL ORGANIZATION
We have chosen the format of a tree, as shown in figure 23, to subdivide the overall ARCS
process into progressively smaller processes that can ultimately be implemented as a hardware
or software algorithm.
The first level of breakdown separates real-time and non-real-time operations within the
ARCS (fig. 23a). Real-time operations imply that some constraint relative to the time avail-
able for function execution is a primary consideration in the requirements set. Real-time
tasks are all those ARCS processes that take place during the operation of the airplane. The
only non-real-time ARCS tasks are ground test operations for preflight verification of system
integrity and maintenance (fault identification and verification of maintenance actions).
Figure 23b illustrates the functional breakdown of the ground test operation.
PREFLIGHT/MAINTENANCE TEST
INITIALIZE TEST DISPLAY
(b) Ground Test
Figure 23.—Continued
Preflight testing is conducted by the flightcrew as a part of the routine predispatch airplane
checkout. The purpose is to assess whether dispatch-required functions are operational. The
results of such a test will be a "go" or "no-go." The purpose of postflight or maintenance
testing is threefold: to test for and register the existence of hard failures that exist within
the system, to test for latent failure conditions that cannot be checked during on-line opera-
tions, and to verify system operational integrity following maintenance action. This testing
is provided for use by the ground maintenance operations, and hence the failure data thus
generated will be aimed at identifying the specific LRU in which a failure has been detected.
A transition from real-time operations to ground test will occur only when such a request
has been generated by the operator and the aircraft is stationary on the ground.
Since ground testing is essentially a non-real-time operation for the majority of the test
sequence, no attempt is made to maintain normal synchronization between the various
processors. Furthermore, computer subsystem testing involves RAM memory write/read
testing, which means that any state variable data previously stored in memory will be lost.
A computer recovery process is consequently initiated upon return of control to real-time
operations.
The second-level breakdown of the real-time operations reflects the digital implementation
of the ARCS and the necessity for processing some of the functions within the redundant
channels with a certain degree of time synchronism.
Synchronous tasks include all those functions that have to be performed in a parallel manner
in all channels to achieve efficient cross-channel consolidation at regular time intervals. These
functions are also referred to as foreground tasks. Asynchronous tasks, or background tasks,
are those tasks that, within certain limitations, can be performed on a time-available basis.
Interrupt processing tasks are those software functions triggered by events external to the
software. These events include iteration timing reference timeouts, I/O interrupts, hardware
faults, and power-on monitor trips.
Asynchronous real-time ARCS tasks are those monitoring functions, performed entirely
within each channel, that are necessary for establishing system integrity during aircraft
operation. In-flight (or on-line) testing of the computer subsystem is the primary function
performed by the asynchronous task. Its purpose is to detect and/or localize those failure
conditions that might not be detected by first-level system monitors. In-line monitoring is
provided both to enhance fault tolerance and to augment fault localization for maintenance
and repair. To achieve high second-failure coverage, a rapid means of localizing faults in
duplex operation is necessary. The in-flight test is a continuous, repetitive check of the
computer subsystem that keeps a running record of anomalies within the local computer
available as additional information to the redundancy management algorithms in resolving
conflicts in duplex configurations.
The design of the ARCS asynchronous operations follows the design requirements tree shown
in figure 23c. The three major asynchronous functions are on-line self-test, maintenance data
update, and system test panel (STP) processing. Because of the obvious benefit to system
maintenance of recording functional anomalies occurring during operation, asynchronous
tasks include algorithms to identify and store information for later use in establishing required
maintenance actions (maintenance data update). Algorithms necessary to provide interface
between the crew and the system through the STP (STP processing) also fall in the category
of asynchronous operations. A more complete description of the self-testing aspect of the
asynchronous task is contained in section 5.1.4.
The core of the fault-tolerant system, redundancy management, is contained in the syn-
chronous operations, which also include the application tasks of control law and mode con-
trol processing shown previously in figure 23a. The projected application tasks for the ARCS,
identified in section 4, consist of those control computations associated with flight-crucial
and flight-critical control functions:
• Flutter suppression
• Structural mode suppression
• Maneuver load alleviation
• Fly-by-wire control
• Stability augmentation
• All-weather autoland
Redundancy management, shown in figure 23d, breaks down into the four processes of
synchronizing the redundant channels, exchanging variable data between channels, substitut-
ing data from an unfaulted processor during recovery attempt, and revising the redundancy
configuration in response to power-on/power-off states and fault occurrences.
Reconfiguration processes to tolerate transient and permanent faults involve three major
stages of the system related to the location of the fault. Reconfiguration due to faults
located functionally upstream of the software sensor selection voting node are handled by
the SSFD. Faults in the processor will manifest themselves at the output monitor or watch-
dog monitor, and the appropriate reconfiguration process, described earlier, will be performed.
A special class of faults, namely those associated with a corrupted cross-channel data bus,
will cause output monitor trips despite correctly operating processors at each end. A cross-
channel bus integrity monitor provides failure localization information to the reconfiguration
process to handle this class of faults.
The third major class of faults includes those located in the servos. The reconfiguration
process for servos involves the monitoring and engagement/disengagement of the individual
servo loops.
In this and the previous section, we have introduced the general system reconfiguration
strategies and the overall ARCS functional organization. Before progressing with the detailed
software design description, we will develop in more detail the functional principles of two
processes within the reconfiguration domain: sensor signal selection and failure detection
(SSFD) and system self-test.
5.1.3 SENSOR SIGNAL SELECTION AND FAULT DETECTION (SSFD)
As implied by the name, the SSFD algorithm has two primary functional objectives: (1) to
extract the most useful data from the redundant set of signals and (2) to determine if any
input signal is operating outside normal tolerance limits and, if so, isolate it so it does not
influence the output from the algorithm.
The ARCS processors operate in frame-time synchronism. The signal selections performed
on the continuous and discrete sensor inputs provide a consolidation point for the redundant
sensor data so that all processors operate on identical data and therefore perform identical
processes with identical results. Thus a simple fault-detection scenario is provided for the
stage downstream of the SSFD output, i.e., any discrepancy between channels indicates a
fault condition.
Sensors are not perfect, however. Each sensor type is afflicted with its particular set of
error characteristics: bias error, scale factor tolerances, dynamic response tolerance, and
noise. Such errors, if not compensated for or eliminated, must be tolerated by the failure
monitoring. This means that fault detection thresholds and detection delays must be large
enough to accommodate legitimate differences between the redundant signals, and the fault
detection capability becomes compromised.
To enhance fault detection, the ARCS SSFD algorithm therefore incorporates bias error
compensation for all sensor signals and scale factor compensation for certain sensor signals
where scale factor tolerances may have a significant effect on fault detection thresholds. An
example in this category would be bank angle, which normally operates near zero but
occasionally reaches up to 30° during maneuvering.
The reconfiguration function of the SSFD algorithm performs the same processes as those
described for the ARCS as a whole in section 5.1.1: fault monitoring and detection, fault
localization, temporary isolation, and permanent isolation of the fault. Fault monitoring
is based on comparison between the redundant signals for first and second fault. Channel
localization of first fault in triplex is automatically done through fault detection, but
determination of fault location in duplex requires information derived from within each
channel. This determination uses sensor valids if available, reasonableness tests, or compari-
son with signals synthesized from state vector data available within the processor. Faulty
signals are isolated by moding the algorithm to ignore the faulty input.
The continuous-signal SSFD algorithm incorporating bias error compensation defined for
the ARCS is shown in block diagram form in figure 24. Each processor performs the proces-
sing depicted for each type of sensor. The total algorithm consists of four subprocesses,
three of which are repeated for each of the three input signals.
First, the bias error calculated during the previous iteration is subtracted from the raw signal
to provide a compensated signal. The difference between the compensated signal and the
average from the previous iteration is then used to monitor for dynamic faults, i.e., a rapidly
deviating raw signal input. In order to tolerate legitimate dynamic differences under certain
operating conditions, dynamic fault detection may be suppressed by delaying the monitor
PROCESSING SEQUENCE: BIAS ERROR COMPENSATION, FAULT DETECTION, AVERAGE CALCULATION
LEGEND: ONE FRAME DELAY
BLOCKS: BIAS ERROR COMPENSATION, RECONFIGURATION ALGORITHM, AVERAGE CALCULATION, BIAS ERROR CALCULATION
INPUTS: A SENSOR RAW DATA, B SENSOR RAW DATA, C SENSOR RAW DATA
Figure 24.—Continuous Signal Selection/Fault Detection Algorithm
trip for some sensors. The dynamic fault detection is part of the SSFD reconfiguration
algorithm. The output of the reconfiguration algorithm is a do-not-use flag against the
raw signal input.
After monitoring is completed for dynamic faults, the three (or two remaining after first
fault) compensated signals are averaged to derive the SSFD algorithm output.
The difference between the new average and the individual raw signal is input to the bias
error calculation to be used during the next iteration. The bias error compensation is pri-
marily a low-pass filter with the purpose of letting through only the static value of the
signal difference. A bias rate limit and a bias magnitude limit are included to protect against
a signal buildup should the raw signal fail actively. The bias error is used to monitor for
static fault. When the bias error reaches a predetermined level, fault is declared.
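The four subprocesses of the continuous-signal algorithm can be sketched in a few lines. This is an illustration under assumptions, not the ARCS code: the class and parameter names are hypothetical, the low-pass filter is taken to be a first-order filter with gain `alpha`, and all numerical values are left to the caller. Temporary versus permanent isolation is not modeled; a flag, once raised, stays raised.

```python
def clamp(x, limit):
    """Limit x to the range [-limit, +limit]."""
    return max(-limit, min(limit, x))


class ContinuousSSFD:
    def __init__(self, channels, dyn_threshold, static_level,
                 alpha, rate_limit, mag_limit):
        self.bias = {ch: 0.0 for ch in channels}  # bias error per input
        self.average = None                       # output of previous iteration
        self.do_not_use = set()
        self.dyn_threshold = dyn_threshold
        self.static_level = static_level
        self.alpha = alpha              # assumed first-order filter gain
        self.rate_limit = rate_limit    # bias rate limit per iteration
        self.mag_limit = mag_limit      # bias magnitude limit

    def step(self, raw):
        """Process one frame of raw signals {channel: value}."""
        # 1. Bias error compensation, using last iteration's bias error.
        comp = {ch: raw[ch] - self.bias[ch] for ch in raw}
        # 2. Dynamic fault detection against the previous average.
        if self.average is not None:
            for ch, value in comp.items():
                if abs(value - self.average) > self.dyn_threshold:
                    self.do_not_use.add(ch)
        # 3. Average of the compensated signals still in use.
        usable = [comp[ch] for ch in comp if ch not in self.do_not_use]
        if usable:
            self.average = sum(usable) / len(usable)
        # 4. Bias error calculation for the next iteration, with limits,
        #    followed by the static fault check on the bias error.
        for ch in raw:
            error = raw[ch] - self.average
            delta = clamp(self.alpha * (error - self.bias[ch]), self.rate_limit)
            self.bias[ch] = clamp(self.bias[ch] + delta, self.mag_limit)
            if abs(self.bias[ch]) >= self.static_level:
                self.do_not_use.add(ch)
        return self.average
```

Note that in this sketch the static fault level must not exceed the bias magnitude limit; otherwise the limited bias error could never reach the level at which a fault is declared.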
A trip of the static or dynamic fault detector is an input to the SSFD reconfiguration logic.
The first step will be to raise the do-not-use flag for the faulty signal. Even if the signal
returns below detection levels, this flag will be kept raised for a predetermined time period
TI, starting from the moment the signal returns, to prevent oscillatory failures from passing
through the selection process.
Should a second fault occur while the first do-not-use flag is raised, the first fault will
immediately be latched as a permanent fault.
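The do-not-use flag behavior, including the hold period and the latching on a second fault, might be modeled as follows. The class name, the `hold_time` parameter (standing for the predetermined time period in the text), and the time units are assumptions.

```python
class DoNotUseFlag:
    """Do-not-use flag for one signal, with hold period and permanent latch."""

    def __init__(self, hold_time):
        self.hold_time = hold_time   # predetermined hold period (value hypothetical)
        self.raised = False
        self.permanent = False
        self._clear_since = None     # time at which the signal returned

    def update(self, now, fault_detected, other_fault_detected=False):
        """Update the flag for the current frame and return its state."""
        if self.permanent:
            return True
        if fault_detected:
            self.raised = True
            self._clear_since = None
        elif self.raised:
            if self._clear_since is None:
                self._clear_since = now  # hold period starts when the signal returns
            if now - self._clear_since >= self.hold_time:
                self.raised = False
                self._clear_since = None
        if self.raised and other_fault_detected:
            # A second fault while this flag is raised latches the first.
            self.permanent = True
        return self.raised
```

Restarting the hold period each time the fault reappears is what keeps an oscillatory failure from passing back through the selection process.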
The duplex-to-simplex logic will be configured according to other information available for
the particular sensor. For sensors with high-confidence internal monitoring, the valid
signal may be a sufficient data item on which to assume whether or not the sensor has
failed. For some sensor signals, functionally redundant information for comparison can be
synthesized from other sensors in the system or, more generally, from the system state
vector.
The discrete SSFD algorithm, shown in figure 25, is simpler since it operates on only two-
valued signals. Time skew between signals is the major complication in providing a non-
ambiguous output from the algorithm.
Each of the inputs is compared with the algorithm output to determine if change of state
is required. The comparison output is majority voted, with the result delayed to suppress
transient anomalies caused by contact bounce and noise, before a change of state is executed.
Fault monitoring is performed on each of the input/output comparisons. Any persistent
disagreement indicates a fault and will cause the associated do-not-use flag to be raised.
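A sketch of the discrete algorithm follows. The names and the two frame-count parameters are hypothetical; the sketch assumes that the delays are counted in frames and that the fault delay is longer than the change-of-state delay, so that a legitimate change of state is not mistaken for a fault.

```python
class DiscreteSSFD:
    def __init__(self, channels, change_delay, fault_delay, initial=False):
        self.output = initial
        self.change_delay = change_delay   # frames a majority must persist
        self.fault_delay = fault_delay     # frames of disagreement before a fault
        self.pending = 0
        self.disagree = {ch: 0 for ch in channels}
        self.do_not_use = set()

    def step(self, inputs):
        """Process one frame of two-valued inputs {channel: bool}."""
        usable = [ch for ch in inputs if ch not in self.do_not_use]
        # Compare each usable input with the current algorithm output.
        differs = {ch: inputs[ch] != self.output for ch in usable}
        # Majority vote on "change of state required", then delay.
        majority = sum(differs.values()) * 2 > len(usable)
        self.pending = self.pending + 1 if majority else 0
        if self.pending >= self.change_delay:
            self.output = not self.output
            self.pending = 0
            differs = {ch: not d for ch, d in differs.items()}
        # Fault monitoring on each input/output comparison.
        for ch, d in differs.items():
            self.disagree[ch] = self.disagree[ch] + 1 if d else 0
            if self.disagree[ch] >= self.fault_delay:
                self.do_not_use.add(ch)
        return self.output
```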
Information derived by the SSFD that is pertinent to the maintenance operation is the identi-
fication of sensors that have been declared as permanently failed. The failure information
thus generated is used by the system test function to provide LRU-level failure identification
that will be stored in a nonvolatile section of memory for later use by line maintenance.
5.1.4 SYSTEM SELF-TEST
Second-level monitoring, performed on-line during in-flight operation to provide fault informa-
tion for the redundancy management function to augment fault localization after fault
detection by a first-level monitor, consists of software routines and hardware monitors. This
monitoring is concerned with the integrity of the processor, memory, and input/output inter-
faces.
Processor self-test is the primary on-line software self-test for the computer unit. Although
it runs in the background mode, it must be completed repetitively as determined by the
time deadline, i.e., a maximum time is specified between sequential completions of the test.
The computer self-test begins with an instruction test sequence within which all instructions
are exercised, all registers are involved, and all addressing modes are used.
Endless loop instructions are included at points in the program where an improper computation
might mislead the processor. If the processor performs an erroneous computation and goes
into one of these endless loops, the time deadline for the computer self-test will not be met.
The second test to be performed in the background mode is a program memory sum check,
wherein all the words in the program memory are summed and compared with the known
correct sum. This test is provided for verification of the overall integrity of the program or
constant memory, but it could conceivably be partitioned by control axis or mode for
improved fault localization if such resolution is possible with the program architecture.
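A minimal Python sketch of the sum check follows. The text states only that the words are summed and compared with a known correct sum; the 16-bit wraparound sum and the function and partition names used here are assumptions for illustration.

```python
WORD_MASK = 0xFFFF  # 16-bit words, matching the ARCS data word length


def memory_sum(words):
    """Return the 16-bit wraparound sum of all words in a memory image."""
    total = 0
    for word in words:
        total = (total + word) & WORD_MASK
    return total


def sum_check(words, expected_sum):
    """Return True when the computed sum matches the known correct sum."""
    return memory_sum(words) == expected_sum


def partitioned_sum_check(partitions, expected_sums):
    """Check each named partition separately and list the partitions that fail."""
    return [name for name, words in partitions.items()
            if not sum_check(words, expected_sums[name])]
```

Partitioning the check, for example by control axis, trades a longer table of expected sums for the ability to name the failed region rather than only report that the memory as a whole is suspect.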
The third test in the computer self-test sequence is a scratch-pad read-write test. A number
of locations in the scratch pad are dedicated to self-testing. On successive iterations of the
test, random patterns are written into these dedicated locations and then checked. This test
is designed to test memory integrity and addressing structure throughout the scratch pad.
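The scratch-pad test can be sketched as follows, assuming a memory that is addressed like a Python list. The fault model (`InvertingMemory`, which stores the complement of the written word at one address) is hypothetical and exists only so that the example has something to detect.

```python
import random


def scratch_pad_test(memory, test_addresses, rng):
    """Write a fresh random pattern to each dedicated location, then read back.

    Returns the list of addresses whose read-back value differed.
    """
    patterns = {addr: rng.getrandbits(16) for addr in test_addresses}
    for addr, pattern in patterns.items():
        memory[addr] = pattern
    # Checking only after all writes also exposes addressing faults, where a
    # write to one location disturbs another dedicated location.
    return [addr for addr, pattern in patterns.items() if memory[addr] != pattern]


class InvertingMemory(list):
    """Hypothetical faulty RAM: one address stores the complement of the data."""

    def __init__(self, size, bad_addr):
        super().__init__([0] * size)
        self.bad_addr = bad_addr

    def __setitem__(self, addr, value):
        if addr == self.bad_addr:
            value ^= 0xFFFF
        super().__setitem__(addr, value)
```

Because a new random pattern is drawn on every call, repeated iterations exercise different bit combinations in the same dedicated locations, which is the property the text relies on.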
Proper operation of the computer input and output sections will be tested using wraparound
loop checks of both analog and discrete data. Analog I/O will be verified by comparing a
special test output with the same output data that has been looped back into the computer
through the multiplexed A/D. Because each of the servo output commands is expected to
remain approximately equal to zero, massive "stuck-at" faults of the multiplexed portions
of the analog I/O will be detected by varying the value of the special test value with each
pass. Operation of the discrete I/O will be similarly checked by one dedicated test discrete
output that will be looped into a dedicated input channel. This discrete value will alternate
between a 1 and 0 state, again to detect "stuck-at" faults of the multiplexed elements of the
discrete I/O.
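The reason for varying the test value on each pass can be shown with a small Python sketch. `LoopbackChannel` is a hypothetical model of an output wired back to an input, with an optional stuck-at fault; it is not a description of the ARCS hardware.

```python
def wraparound_check(write_output, read_input, test_values, tolerance=0):
    """Command each test value and compare it with the looped-back reading.

    Returns True when every looped-back value is within the tolerance.
    """
    for value in test_values:
        write_output(value)
        if abs(read_input() - value) > tolerance:
            return False
    return True


class LoopbackChannel:
    """Hypothetical I/O model: an output wired back to an input."""

    def __init__(self, stuck_at=None):
        self.stuck_at = stuck_at    # None means the path is healthy
        self._latched = 0

    def write(self, value):
        self._latched = value

    def read(self):
        return self._latched if self.stuck_at is None else self.stuck_at
```

With a constant test value of 0, a path stuck at 0 passes the check; alternating between 0 and 1 (or stepping an analog test value) exposes the fault on the next pass.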
Each processor can operate a loop test through the interchannel data link by transmitting a
sequence of known random bit patterns to the other processors as a part of the normal cross-
channel data transmission. The other processors must return the pattern unchanged to the
originating processor. This test is used to provide integrity checking of the interchannel data
links.
When a failure is detected during the in-flight test, the failure condition will be recorded in a
temporary storage register. When the failure is recorded, control is returned to the test
program to continue testing. In this way the entire test sequence can be completed and all
failure information accumulated before any update is made to the maintenance data.
All failure data generated by the on-line self-testing function is stored in the system status
table as well as the maintenance data table. This information is then available to the redun-
dancy management processes for fault localization purposes.
A further aspect of system self-test is the ground test operation whose purpose is to verify
total system operational integrity for preflight verification or following system maintenance.
Any system failure conditions detected during ground test will be analyzed and the data
formatted for display on the STP, with the particular display depending on the failure
condition and the test mode selected.
The procedure of ground testing, whether it be for preflight or maintenance purposes, makes
use of a so-called "center out" test philosophy, wherein the most basic elements are tested
first and then these basic elements are used to test other functions. This is illustrated by the
ground test functional requirements tree in figure 26. At the highest level, this test structuring
results in an organization that tests the computer LRU first, then the input/output electronics
and sensors, and finally the servo systems. The computer testing segment is further broken
down into its constituent parts, i.e., central processor element, computer hardware monitors,
and RAM and ROM memories.
Following a diagnostic test of the processor as previously described for on-line self-testing,
the ability of each of the hardware monitors to detect and annunciate the existence of a
fault condition is verified. The ability of the watchdog monitor to indicate a "good" state
and a failed state will first be verified. Next, the ability of the arithmetic error detector to
detect an overflow condition and generate the corresponding interrupt will be verified.
A parity generator inversion discrete, controllable from within the processor, is provided in
the ARCS. Setting this discrete will cause the words written into memory to be stored with
even parity. When an attempt is made to read this data out of memory, a parity error inter-
rupt should be generated. Where separate parity error detectors are provided for different
sections of memory, this test will be repeated for each parity detector in the system.
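The parity detector test can be sketched in Python as follows. The text implies that the normal generator stores odd parity, since the inverted generator stores even parity; that reading, the `ParityMemory` model, and the test pattern are assumptions for illustration.

```python
def ones_count(word):
    """Count the 1 bits in a 16-bit word."""
    return bin(word & 0xFFFF).count("1")


class ParityMemory:
    """Hypothetical memory model storing a parity bit with each 16-bit word."""

    def __init__(self, size):
        self.words = [0] * size
        self.parity_bits = [1] * size    # odd parity for an all-zero word
        self.invert_parity = False       # models the inversion discrete

    def write(self, addr, word):
        # Normal generator: choose the parity bit so the 17-bit total is odd.
        bit = (ones_count(word) + 1) % 2
        if self.invert_parity:
            bit ^= 1                     # inverted generator stores even parity
        self.words[addr] = word & 0xFFFF
        self.parity_bits[addr] = bit

    def read(self, addr):
        """Return (word, parity_error); the checker always expects odd parity."""
        word = self.words[addr]
        total = ones_count(word) + self.parity_bits[addr]
        return word, total % 2 == 0


def parity_detector_test(memory, addr, pattern=0x5A5A):
    """Verify that the checker passes good data and flags inverted parity."""
    memory.invert_parity = False
    memory.write(addr, pattern)
    _, error_normal = memory.read(addr)
    memory.invert_parity = True
    memory.write(addr, pattern)
    _, error_inverted = memory.read(addr)
    memory.invert_parity = False
    memory.write(addr, pattern)          # restore valid parity before returning
    return (not error_normal) and error_inverted
```

The test passes only if the checker stays quiet for correctly stored data and reports an error for the deliberately inverted word, so a checker that never fires is itself detected.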
The ROM or constant memory will be tested using sum checking techniques as described
for on-line self-testing. The operational integrity of the write and read aspects of RAM
memory will be verified by use of a predetermined data pattern. By comparing the pattern
readout with the pattern that should have been written in, failures within the RAM memory
can be detected. In addition to comparing data patterns, every write and every read cycle
passes the data through the hardware parity generator and parity checker, respectively.
In the next step, a test will be conducted to verify the operational integrity of the computer's
output and input electronics. The ARCS I/O can be roughly divided into three categories:
analog, discrete, and digital. The approach used is what is commonly referred to as wraparound
testing, whereby a specific output is commanded by the processor and the resultant
output looped back into the corresponding input processor. The received data can then be
compared with the transmitted data to check both input and output functions.
For analog data, a single wraparound test path is sufficient to verify all the common or
multiplexed elements of the input and output electronics. However, such a test does not
check a significant amount of electronics downstream of the demultiplexer on the output
(namely, individual sample and holds) and upstream of the multiplexed A/D (such as individ-
ual buffers, demodulators, and filters). These elements are typically dedicated to specific
inputs and outputs. The proposed baseline ARCS hardware includes the capability to switch,
under software control, all the analog output commands into all the analog A/D input chan-
nels. In this way all I/O channels can be tested over their full range as a part of the ground
test routines.
A similar capability has been provided for discrete data by switching all discrete outputs into
all discrete input channels; digital I/O data uses a hardware loop from the cross-channel data
transmitter to receivers. All such loop testing is accomplished under software control.
In addition to the intrachannel digital I/O loop check, each processor can operate an inter-
channel loop test through the interchannel data link. A sequence of known bit patterns is
transmitted to the other processors, which must return the pattern unchanged to the
originating processor. This test serves two purposes: to provide an integrity check of the
interchannel data link and to assess the operational redundancy level of the computer
subsystem.
To verify sensor system functional capability requires that at least its output be stimulated to
produce some expected value. To do this without using ground support equipment (GSE)
implies that, wherever possible, a sensor's own internal self-test be used to provide this
output. Based on the expressed desirability of simplicity, speed, and one-man operation,
automatic stimulation of sensor self-test functions will be provided wherever possible.
Where automatic stimulation of self-test features is not possible, the interactive approach,
wherein the test operator is requested via the STP to provide the required stimulus, is used
to maximize test effectiveness.
The servoactuator is the third major functional block to be checked out during ground test.
A functional breakdown of those tasks required to check out the operation of any given
channel of servoactuation includes verification of the engagement/disengagement function,
dynamic response testing, and verification of the force voting feature. The testing tasks/
procedures must be repeated for every axis of actuation.
5.2 SOFTWARE DESIGN
The functional concepts described in the previous section were translated into a software
design using the methodology described below. The ARCS software design is further illus-
trated in appendix B down to a level relevant for the understanding of the fault-tolerant
operation of the baseline ARCS, i.e., the application software is described only to a level
required for understanding its overall place in the complete design.
Reliable software is an obvious requirement for a fault-tolerant computer system. Reliable
software requires a strict software design methodology. The software design methodology
adopted for the ARCS design process recognizes that the single most important factor in
achieving reliable software is to maintain design visibility on all levels throughout the design.
This visibility is necessary to ensure that all requirements are being correctly interpreted and
met, as well as to provide continuity across interfaces between personnel involved in the
development, testing, certification, and continuous operational maintenance of the system.
The software design process has been set up as a sequence of steps. In practice, each step
may be repeated several times. Each step includes appropriate documentation. The steps
are as follows.
1. Perform the top-down functional identification of the system design and generate the
associated process function tree.
2. Draw the software structure definition tree.
3. Enumerate all modules of this tree and compile the data-space elements for each module.
4. Starting with the module for level 1, construct the intramodule transition diagram for
each module of the system.
5. Generate the software code from the transition diagram in a top-down manner.
Details of these design steps are presented below.
5.2.1 TOP-DOWN DESIGN (STEP 1)
Top-down design is the process of systematically defining processes that satisfy the require-
ments of the desired system. The output of this step is a description of each process that was
defined and the process function tree. This tree is simply a drawing that shows the top-down
relationship of the functions, and it will be used in step 2 to generate the software structure
definition tree. A typical process function tree is shown in figure 27.
5.2.2 SOFTWARE DESIGN TREE (STEP 2)
The requirements tree of step 1 is used to generate the software structure definition tree.
Identical to the function tree, this tree is used to provide short names for the functions and
to define the modules of the system. A typical tree and its modules are shown in figure 28.
A module is a collection of related processes. The processes in a module share a common
data space that is defined for the module. Figure 29 shows a module and the relationship
with its submodules. A module is typically only a link to its submodules and as such does
not perform any computation. The lowest level processes are the ones that actually do the
computations.
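One way to picture the software structure definition tree is as a tree of module records, sketched below in Python. The dotted form of the structured name and the example short names are assumptions; the report specifies only that a module takes the structured name of its supramodule.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Module:
    """A node of the software structure definition tree."""
    short_name: str
    submodules: List["Module"] = field(default_factory=list)
    parent: Optional["Module"] = None

    def add(self, child):
        """Attach a submodule and return it so that calls can be chained."""
        child.parent = self
        self.submodules.append(child)
        return child

    @property
    def structured_name(self):
        """Prefix the short name with the supramodule's structured name."""
        if self.parent is None:
            return self.short_name
        return f"{self.parent.structured_name}.{self.short_name}"

    def leaves(self):
        """Return the lowest-level processes, which perform the computations."""
        if not self.submodules:
            return [self]
        return [leaf for sub in self.submodules for leaf in sub.leaves()]
```

In this picture an interior module only links to its submodules, and `leaves()` returns the processes that would actually be coded as computations.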
Figure 27.—Process Function Tree
Figure 28.—Typical Tree and Its Modules
Figure 29.—Software Module
In step 4, a transition diagram for each module will be constructed. The transition diagram
shows intramodule control transfer. Control transfer from a module to its submodule is to
the single submodule with a transition in the transition diagram that does not have an origin
state. The transition back to the supramodule is defined by the module in the transition
diagram that has a transition with no destination state.
Note that in such a tree as shown in figure 28, a module will take the structured name of the
supramodule. This notation will facilitate discussing the design and relating the design to
the software.
5.2.3 MODULE IDENTIFICATION (STEP 3)
In this step a table of data-space elements is defined for each module. To aid in defining
these data-space elements, one should consider the input and output requirements of the
submodules. These elements are the data to describe the state of the system, tables to
record features of the execution, arrays of data, etc. The entries in the table have the form
<short name>, which is a description of the data item such as dimension of the array, the
meaning of the data item, etc.
The data space for a module is defined by the set
$$\bigcup_{i=1}^{n} I_{s_i} \;\cup\; \bigcup_{i=1}^{n} O_{s_i}$$
where $I_{s_i}$ and $O_{s_i}$ are the inputs and outputs for the processes of the states $s_1, \ldots, s_n$ shown in
figure 29. The data interface between the module $S$ and the supramodule in which $S$ is a
submodule is given by the set $I_S \cup O_S$.
This initial definition of the variables will serve as a starting point when the control aspect
of each module is addressed in the next step. At this point the only control that has been
mentioned is the passing of control from a module to its submodules. The intramodule
control is described in the next step.
5.2.4 TRANSITION DIAGRAM (STEP 4)
The transfer of control between states in a module can be represented either with flow charts
or by transition diagrams. The transition diagram is used to show the flow of control in
this methodology because it is felt that it provides better visibility into how the process
relates to events in time.
Since the intent is to generate a design that is amenable to structured programming, the
transition diagrams will be drawn using a limited set of single-entry/single-exit constructs.
Figure 30 shows the set of constructs.
The states that are the submodules in the module are the processes that must be related by
some control structure. These control paths are shown by the transitions in the diagram.
The transitions are drawn by considering the states in the module two at a time to determine
if a transition should occur between the states.
When it is determined that a transition should occur, a condition is defined if needed. This
condition is recorded in a table of conditions for this module. The inputs that caused this
condition are then determined. These inputs will consist of data items that are recorded
in the data-space table for the module. This table's definition was started in step 3. As new
entries are determined, they are placed in the table.
A state in a transition diagram represents a process that will be implemented in software.
Figure 31 shows a state S and defines (1) inputs and outputs of the process represented by S,
(2) conditions that cause transition from a state, and (3) data-space elements used to compute
the condition.
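The relationship between states, conditions, and the data space can be sketched as a table-driven interpreter. The Python below is an illustration under stated assumptions: the table layout, the function name, and the example states are hypothetical, and the leading minus sign for a complemented condition follows the convention of figure 31.

```python
def run_module(entry, transitions, processes, conditions, data):
    """Execute one module: run states until a transition has no destination.

    transitions maps a state to an ordered list of (condition_name, next_state);
    condition_name None means unconditional, a leading "-" means complement,
    and next_state None is the exit back to the supramodule.
    """
    state, trace = entry, []
    while state is not None:
        processes[state](data)            # the process updates the data space
        trace.append(state)
        for name, next_state in transitions[state]:
            if name is None:
                taken = True
            elif name.startswith("-"):
                taken = not conditions[name[1:]](data)
            else:
                taken = conditions[name](data)
            if taken:
                state = next_state
                break
        else:
            raise RuntimeError(f"no transition enabled from state {state}")
    return trace


# Hypothetical module using the loop construct: STEP repeats while C1 holds.
processes = {
    "INIT": lambda d: d.update(count=0),
    "STEP": lambda d: d.update(count=d["count"] + 1),
    "DONE": lambda d: d.update(done=True),
}
conditions = {"C1": lambda d: d["count"] < 3}
transitions = {
    "INIT": [(None, "STEP")],
    "STEP": [("C1", "STEP"), ("-C1", "DONE")],
    "DONE": [(None, None)],
}
data = {}
trace = run_module("INIT", transitions, processes, conditions, data)
```

The condition table and the data-space table appear here as the `conditions` and `data` dictionaries, which makes it easy to check that every condition is computed only from recorded data-space elements.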
Figure 30.—Software Constructs (concatenation, loop, and case)
$I_s$       Input to the process $P_s$
$O_s$       Output from the process $P_s$
$C_i$       Condition that causes transition i to be traversed. If a minus sign
            precedes $C_i$, then the condition is the complement of $C_i$.
$I_{c_i}$   The inputs that caused $C_i$ to be true
$O_{c_i}$   The outputs that were computed from $I_{c_i}$
Graphic symbols in the figure mark the inputs to or outputs from a state, the transitions
into or out of a state, and the state S that defines a process $P_s$.
Note that the $I_{c_i}$ are subsets of $I_s$ and the $O_{c_i}$ are subsets of $O_s$.
Figure 31.—State Definitions
5.2.5 SOFTWARE CODE (STEP 5)
To simplify the coding, a higher level language could be used that supports the constructs
and represents a suitable method for handling data structures at the bit, byte, and word level.
Each module can be considered as a single-entry/single-exit block of code. The control
structure of the module is coded by using the defined constructs. It is the programmer's
responsibility to determine how each module should be implemented. There are two alter-
natives. One is to use a subroutine; the other is to use "in-line code."
Documentation must proceed with the design and in all cases ultimately describe the software
that implements the design. The design is described by documenting each module of the
software structure tree. The outline for the module documentation is as follows (see app. B
for illustration).
Module: STRUCTURED NAME: Descriptive Name
a) Description: Paragraph that describes the module's function and references to the
   requirements, if applicable.
b) Subprocesses: A list of the submodules in the module by short name and descriptive name.
c) Inputs and outputs of the subprocesses: A table of inputs and outputs for each submodule.
d) Data-space elements: Described by giving a short name and a description of the element.
e) Control flow diagram: Described by using transition diagrams or flow charts.
f) Condition table: Table that defines the conditions used in specifying the control flow.
5.3 HARDWARE DESIGN
The ARCS concept emphasizes software processes in achieving fault-tolerant capabilities.
The ARCS candidate hardware must be viewed as a vehicle to facilitate an effective software
design for the overall reconfiguration and application processes described or identified in the
previous sections.
The hardware description is divided into the hardware system architecture, the hardware
interfaces, and the computer unit discussed below. The instruction list is presented in
appendix C. A detailed discussion of the hardware configuration rationales and trade studies
is contained in appendix D.
5.3.1 HARDWARE SYSTEM ARCHITECTURE
The term "architecture" as used here refers to these aspects of the candidate ARCS hardware:
the interconnection of the major hardware elements within the system, the functional organi-
zation of the computer unit, and the overall redundancy management/reconfiguration
structure. Figures 32, 33, and 34 contain block diagrams corresponding to each of these
architectural aspects. These diagrams provide the foundation for the discussion of system
architecture that follows.
The interconnection of the candidate ARCS hardware is shown in figure 32. The baseline
structure is a triplex system with a single computer unit per channel. Each computer unit
contains a processor, memory, and all channel interface electronics. Sensor, mode control,
and servo interfaces are dedicated on a channel basis, with data exchanged between computers
via cross-channel data buses. Each computer exclusively controls the engagement and shut-
down on its own servos.
All cross-channel communication (excluding frame sync discretes) is accomplished via
dedicated one-way optical serial digital data buses that independently interconnect each
computer to each other computer. A functionally separate utility interface with the system
test panel is provided on a per-channel basis. This is a serial digital interface per EIA
Standard RS-232C. The interfaces required for an expansion to a quadruplex redundancy
level are shown by the broken lines in figure 32.
A functional block diagram of the ARCS computer unit is illustrated in figure 33. Key
features include the following:
• Architecture is highly bus organized.
• Flexible computer I/O system uses the "directly operable input/output" (DOIO) concept,
which:
— Provides direct memory access (DMA) input/output capability as needed by each
interface element, without interference with CPU operations
— Provides interrupt-initiated or processor-controlled input/output using a standard
device-controller structure
— Allows the CPU to access and operate on all I/O data directly within the RAM
memory structure
• Solid-state memory is partitioned into functionally independent program memory
(PROM) and variable/scratch-pad memory (RAM) sections.
The redundancy management block diagram for each channel of the ARCS is shown in
figure 34. Key features of the reconfiguration design indicated by this diagram (in conjunc-
tion with the previous two diagrams) are the following:
Figure 32.—ARCS Hardware Interconnection Block Diagram
• Triplex architecture is based on two voting nodes, one at the level of sensor data and the
other at the force-summing point connecting servoactuator outputs.
• With appropriate operating software, the system tolerates all first failures, a high per-
centage of second like failures, and multiple numbers of transient faults at each voting
node.
• Frame-synchronous computation is achieved through a software sync routine using
cross-channel discretes.
• Iteration timing reference provides real-time interrupt and, in conjunction with the
operation of the sync routine and the watchdog monitor, also initiates recovery
operations.
• Software servo shutdown decisions are backed up with two hardware monitors in each
computer unit: the watchdog monitor and the dual servo-loop electronics monitor (one
per servo).
• Dual servo-loop electronics are used to provide servo self-monitoring capability for all
faults not detectable by cross-channel comparison.
• The watchdog monitor detects gross computer failures that cause loss of real-time
control.
• All cross-channel comparisons of computer outputs and servo differential errors are
accomplished in software.
• All cross-channel data exchange is initiated by the sender.
5.3.2 HARDWARE SYSTEM INTERFACES
The ARCS hardware utilizes a software sync routine to establish and maintain frame-
synchronous computations in all three computer channels. To facilitate this type of operation,
each channel issues a local sync command (LSC) discrete that has these functions:
• When SET, it informs the other channels that the local channel is ready to sync.
• When CLEARED, it causes the iteration timing reference counter to reset and begin
counting out the next iteration period.
• When not SET and CLEARED in a real-time period that falls within a time interval
tolerance, it causes the watchdog monitor to indicate computer failure.
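The timing role of the LSC discrete can be illustrated with a simplified Python model of the watchdog monitor. The interpretation used here, that each clear must follow a set and must arrive one iteration period after the previous clear within a tolerance, is an assumption drawn from the description above; the class, the method names, and the time units are hypothetical.

```python
class WatchdogMonitor:
    """Flags a computer failure when the LSC is not set and cleared on time."""

    def __init__(self, period, tolerance):
        self.period = period          # nominal iteration period (assumed units)
        self.tolerance = tolerance    # allowed deviation from the period
        self.last_clear = 0           # time of the previous LSC clear
        self.seen_set = False
        self.failed = False

    def lsc_set(self):
        """The local channel announces that it is ready to sync."""
        self.seen_set = True

    def lsc_clear(self, now):
        """A clear restarts the iteration period; check its timing first."""
        elapsed = now - self.last_clear
        on_time = abs(elapsed - self.period) <= self.tolerance
        if not (self.seen_set and on_time):
            self.failed = True
        self.last_clear = now
        self.seen_set = False

    def poll(self, now):
        """Called periodically: a missing clear past the window is a failure."""
        if now - self.last_clear > self.period + self.tolerance:
            self.failed = True
        return self.failed
```

In this model a processor caught in an endless loop never issues the clear, so `poll` reports the failure once the tolerance window has passed.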
As indicated by figure 35, the LSC discretes are cross-channel interchanged as bits in each
computer's status register. The iteration timing reference generates a priority interrupt that
defines the real-time frame rate. If the LSC fails to satisfy the requirements of the watchdog
monitor, then a computer failure is indicated, and the next interrupt from the iteration
timing reference is interpreted as a recovery interrupt by the software. The integration of the
iteration and recovery interrupts into a single interrupt minimizes the additional hardware
that must be provided to facilitate recovery within ARCS. In addition, it provides a regular
exercise, and therefore validation, of nearly all of the electronics involved in initiating
recovery from outside the processor.
Figure 35.—Interface to Allow Software Frame Synchronization
(SR = status register containing logic indicators L1, L2, and L3)
The ARCS uses cross-channel data exchange to achieve many of the fault-tolerant characteris-
tics. A high-speed, one-way, serial, digital, optical data link allows the following data trans-
fers from each computer to any other computer:
• Sensor data for sensor selection and failure monitoring
• Output commands for failure monitoring
• Servo data for failure monitoring
• Recovery data
• Failure status and maintenance data
The use of an optical data link ensures complete electrical isolation between computers.
Figure 36 illustrates the basic data link structure between any two channels. Each trans-
mitter operates in the processor-controlled DOIO mode. Data is sent in bursts of up to 64
words each. A total of 1024 words of dedicated RAM storage is provided at each receiver.
The receivers operate in the DMA mode of the DOIO system. Transmissions are at a 2-MHz
bit rate using a self-clocking signal format from MIL-STD-1553.
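The figures quoted above allow a rough estimate of burst transfer time. The Python sketch below computes a lower bound from data bits alone and, separately, a case with four overhead bit times per word. The overhead value is an assumption: it matches the 20-bit-time word of MIL-STD-1553 (a sync field of three bit times, 16 data bits, and one parity bit), but the report states only that the signal format is taken from that standard.

```python
BIT_RATE_HZ = 2_000_000      # from the text: 2-MHz bit rate
WORD_BITS = 16               # ARCS data word length
BURST_WORDS = 64             # maximum burst length from the text
RECEIVER_RAM_WORDS = 1024    # dedicated RAM storage at each receiver


def transfer_time_us(words, overhead_bits_per_word=0):
    """Return the minimum time in microseconds to send the given words."""
    bits = words * (WORD_BITS + overhead_bits_per_word)
    return bits * 1_000_000 / BIT_RATE_HZ


burst_data_only_us = transfer_time_us(BURST_WORDS)        # data bits only
burst_with_overhead_us = transfer_time_us(BURST_WORDS, 4)  # assumed framing
bursts_per_buffer = RECEIVER_RAM_WORDS // BURST_WORDS      # full bursts in RAM
```

Under these assumptions a full 64-word burst occupies the link for roughly half a millisecond, and the receiver storage holds sixteen maximum-length bursts.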
The triplex force-summed servoactuator design for ARCS combines experience from two
fly-by-wire systems by using:
• Low-pressure-gain servovalve/servoactuator modules and cross-channel monitoring
concepts from the 680J secondary actuator (Survivable Flight Control System
Development).
• Self-monitored, independent channel concepts from the HLH-DELS Program.
The selected design has the distinct advantage of not requiring differential pressure feedback
equalization. Figure 37 shows a block diagram of the servoelectronics/servoactuator concept.
Since each servo uses dual servo-loop electronics for the purpose of self-monitoring, each
computer must generate two D/A output commands per servo. In addition, a servo engage/
shutoff discrete is required for each servo. The computer software control of this discrete
includes the capability to automatically reset and reengage a bypassed servo. Figure 37
outlines this operation. Notice that automatic reengagement is interlocked with the output
of the watchdog monitor. This monitor must indicate that the computer is not failed before
an automatic reset and reengage can occur. On the other hand, either the computer-
generated servo engage/shutoff discrete or the watchdog monitor backup shutoff discrete
can cause servo bypass.
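The engage, bypass, and reengage rules in this paragraph reduce to a small piece of logic, sketched below in Python. The class and argument names are hypothetical, and the sketch models only the interlock, not the servo hardware.

```python
class ServoEngageLogic:
    """Models the engage/bypass state of one servo channel."""

    def __init__(self):
        self.engaged = True

    def step(self, computer_shutoff, watchdog_failed, reset_request=False):
        """Advance one frame and return True while the servo is engaged."""
        if computer_shutoff or watchdog_failed:
            # Either the software discrete or the watchdog backup discrete
            # is sufficient to bypass the servo.
            self.engaged = False
        elif not self.engaged and reset_request:
            # Automatic reset and reengage is permitted only while the
            # watchdog indicates that the computer is not failed.
            self.engaged = True
        return self.engaged
```

A reset request issued while the watchdog indicates a failure has no effect, which is the interlock described in the text.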
The servos are monitored by two different techniques:
• Comparison of servovalve dual-coil currents (differential current) within the dual
servo-loop electronics
• Cross-channel comparison of servovalve (sum) coil current (which is proportional to
differential pressure) using computer software
The first technique is used to detect failures within the servo-loop electronics and the
position feedback sensors. The second technique is used to detect both servo and computer
command failures that are not detected by other monitoring functions. In addition, pressure
switches are used to detect gross pressure differentials (hardover errors) or loss of hydraulics.
This combination of servo monitoring techniques makes it possible to achieve two-fail-
operational performance for a large proportion of second failures.
The D/A outputs from the processor are generated in the processor-controlled DOIO mode
using what appears as a dedicated D/A per output. In effect, this means that no analog
sample-and-hold limitations apply. Individual outputs may be "frozen" or updated at different
rates as determined by the software. Each D/A output appears as one RAM memory location
to the processor.
Within the ARCS hardware architecture, input sensors and mode controls are dedicated on a
channel basis. Thus, there is no cross-strapping of sensor data prior to the analog-to-digital
and digital-to-digital interfaces. All cross-channel exchange of sensor data is accomplished via
the cross-channel data link.
The control of sensor input conversions occurs within the DOIO structure. For analog inputs,
a multiplexed A/D converter begins its sample sequence following the iteration interrupt.
Inputs are converted and stored in RAM memory in a DMA mode.
Because of the DOIO structure, RAM memory for DMA operations is accessed with no
interruption of CPU activity. Once stored in RAM, the input data is immediately available
for use by the program. Double buffering of analog data for (bit-identical) sensor selection
purposes is not required with appropriate executive scheduling of computations.
Serial digital inputs, from asynchronous external devices, are received using dedicated
standard interface receivers with one receiver per data source. Input data is stored in RAM
using the DOIO DMA mode. A fixed block of RAM storage is assigned to each serial digital
receiver. Double buffering for (bit-identical) sensor selection is required.
Serial digital outputs to asynchronous external devices (excluding the RS-232C system test
interface) are provided using dedicated standard interface transmitters. One transmitter
is used for each different output standard. Each transmitter operates in a processor-controlled
mode and, within DOIO, appears as one RAM memory location to the processor.
Discrete I/O is handled in a DMA mode for inputs and a processor-controlled mode for outputs.
Discretes are packed within the 16-bit data word format and, with the bit manipulation
instructions, are easily operated on by the program.
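Packing discretes into a 16-bit word and operating on them bit by bit can be sketched as follows. The helper names and the bit assignments in the usage note are hypothetical; they stand in for the bit manipulation instructions mentioned above.

```python
WORD_MASK = 0xFFFF  # discretes are packed in the 16-bit data word


def set_discrete(word, bit):
    """Return the word with the given discrete (bit 0 to 15) set to 1."""
    return (word | (1 << bit)) & WORD_MASK


def clear_discrete(word, bit):
    """Return the word with the given discrete cleared to 0."""
    return word & ~(1 << bit) & WORD_MASK


def read_discrete(word, bit):
    """Return the state (0 or 1) of the given discrete."""
    return (word >> bit) & 1
```

For example, with bit 3 assumed to carry a servo engage discrete, `set_discrete(0, 3)` yields a word in which only that discrete is asserted, and the remaining fifteen discretes in the same word are untouched by later set or clear operations on bit 3.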
5.3.3 ARCS COMPUTER UNIT
The functional block diagram of the ARCS computer unit was presented in figure 33.
The major functional elements shown in the block diagram are the following:
• Central processor unit
• Program memory
• Scratch-pad (variable) memory
• Analog input interface
• Servo output interface
• Discrete interface
• Iteration timing reference and watchdog monitor
• Cross-channel interface
• System test/GSE interface
• Digital interface
Each of these functional elements is described below, along with power supply and test and
monitoring functions.
5.3.3.1 CPU and Memory
Table 2 summarizes the functional characteristics of the processor.
Table 2.—Functional Characteristics Summary of ARCS Processor
Item                      Characteristics
Type                      General purpose, stored program, uniprocessor
Number system             Binary, fixed point, 2's complement, fractional
Data word length          16 bits standard, 32 bits double-precision
Instruction word length   16 bits
Register structure        Accumulator organized with three index registers
Instructions              Microprogrammed set of 104 with application-dependent
                          spares for special op-codes
Throughput                420 kops (85% add, 10% multiply, and 5% divide)
Address modes             Direct, indirect, program counter and index register
                          relative, and immediate
Interrupts                Eight level, software maskable
Memory structure          Independent program and variable memories
Input/output structure    DOIO concept: allows interrupt-controlled,
                          processor-controlled, or noninterference
                          DMA-controlled input/output
The CPU interfaces with and controls the program memory, the scratch-pad memory, and
all the interface elements within the directly operable input/output (DOIO) system. The
CPU receives instructions from program memory, interprets them, and performs the indicated
arithmetic, logical, branching, or control operation. The process of memory access and
instruction execution is controlled by a stored microprogram within the CPU. Because
of this microprogrammable design, the CPU may be conveniently divided into two distinct
functional elements: the microprogram control structure and the register structure. It
is the microprogram control structure that maps each machine op-code into particular
activity within the register structure.
The register structure shown in figure 38 is a three-bus system centered about the arithmetic
element. All buses are 16 bits in width. The A and B buses are tristate so that multiple
sources may be enabled onto them. The B and C buses provide the communication links
for address and data for program memory, scratch-pad memory, and input/output devices.
The A bus is resident within the CPU register structure.
The instruction set is tailored for real-time control applications. It has a full complement
of load/store, arithmetic double precision, logical, branching, register, input/output, and
shifting instructions that have proven useful in control applications. The instruction set
is an expansion of the General Electric MCP-701 instruction set, particularly in the area
of immediate instructions, bit manipulation instructions, interrupt handling instructions,
and the scratch-pad memory reference instructions that allow direct manipulation of
input/output data. In addition, the processor has reserved 28 instructions that may be
microprogrammed to suit the application. All instruction execution times are based on
a 1.0-µs cycle time program memory. The detailed instruction-by-instruction description
is presented in appendix C.
The CPU has an eight-level priority interrupt system. The highest level is not maskable
whereas the seven lower levels are maskable using the seven most significant bits of the
status register. All interrupts are set by external stimuli and reset by the software. The priority
of the seven maskable interrupts is established by a linear daisy chain. This priority is
alterable under software control by using the mask bits of the status register. The processor
has four machine language instructions specifically designed to minimize the overhead
burden associated with interrupt processing.
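
The arbitration just described can be sketched in Python as follows. Several details are assumptions made only for illustration: level 0 is taken as the highest (nonmaskable) level, a mask bit of 1 is taken to inhibit its level, and status register bit 15 is assumed to mask level 1, down to bit 9 for level 7. The report states only that the seven most significant bits of the status register are the mask bits.

```python
from typing import Optional


def mask_bit_position(level: int) -> int:
    """Status register bit assumed to mask `level` (1..7): bit 15 down to bit 9."""
    return 16 - level


def next_interrupt(pending: int, status: int) -> Optional[int]:
    """Return the highest-priority interrupt level to service, or None.

    pending: bit i set means level i is requesting (level 0 = highest priority).
    status:  16-bit status register whose seven MSBs are the mask bits.
    """
    for level in range(8):              # scan in priority order, like the daisy chain
        if not (pending >> level) & 1:
            continue                    # this level is not requesting
        if level == 0:
            return 0                    # the highest level is not maskable
        if not (status >> mask_bit_position(level)) & 1:
            return level                # requesting and not masked
    return None
```

Under these assumptions, setting a mask bit simply removes that level from the scan, which is how software can alter the effective priority of the maskable levels.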
The program memory is separate and independent from the variable scratch-pad memory.
Interfacing between a memory module and the CPU is by way of the B and C buses.
The CPU supplies control lines to all memory modules. A memory module is active only
if the address supplied from the CPU enables it. The program memory is composed of two
sections:
• PROM (fusible link) memory
• Nonvolatile storage zone
Because of the flight-critical nature of fly-by-wire applications, the memory configuration
provides for maximum software control and the best resistance to transient conditions
that may affect destructive readout (DRO) configurations. It employs semiconductor
read-only or programmable read-only memories for program store.
A nonvolatile storage zone is provided to record and hold a historical record of fault
occurrences for engineering and/or shop maintenance purposes. This portion of program
memory is therefore associated exclusively with the maintenance aspects of the ARCS
system test function.
The processor has semiconductor scratch-pad random access memory for intermediate
variable storage and bulk variable storage. It has its own memory address register and
enable and control logic. It is controlled by the CPU independently from the program
memory.
5.3.3.2 Directly Operable Input/Output (DOIO)
Input/output devices are imbedded within the scratch-pad memory addressing structure.
The data from the input/output device may be directly operated on rather than first
transferring it to main-line variable memory. DOIO treats input/output data as distributed
memory locations. Each distributed memory section is a functionally independent memory.
Each additional input/output device contains its own scratch-pad area.
Three types of input/output sequences are permitted:
• DVC—device-controller interrupt initiated
• DMA—direct memory access to the distributed scratch-pad
• PC—processor controlled
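
The DOIO idea of treating input/output data as distributed memory can be illustrated with a small Python model. This is a minimal sketch, not the ARCS address decoding: the class name, base addresses, and section sizes are hypothetical.

```python
class DoioSpace:
    """Scratch-pad address space built from functionally independent sections."""

    def __init__(self) -> None:
        self._sections = []  # (base, size, storage) tuples

    def attach(self, base: int, size: int) -> list:
        """Add a section: main variable memory or a device's local scratch-pad."""
        storage = [0] * size
        self._sections.append((base, size, storage))
        return storage

    def _locate(self, addr: int):
        for base, size, storage in self._sections:
            if base <= addr < base + size:
                return storage, addr - base
        raise ValueError(f"no section decodes address {addr:#06x}")

    def read(self, addr: int) -> int:
        storage, offset = self._locate(addr)
        return storage[offset]

    def write(self, addr: int, value: int) -> None:
        storage, offset = self._locate(addr)
        storage[offset] = value & 0xFFFF


space = DoioSpace()
ram = space.attach(0x0000, 256)  # hypothetical main variable memory
adc = space.attach(0x0400, 32)   # hypothetical A/D result area, one word per channel
adc[5] = 0x1234                  # the device deposits a sample (by DMA in the real unit)
space.write(0x0010, space.read(0x0405))  # the processor uses it like any other operand
```

The point of the sketch is that the processor needs no separate input instruction or copy step: the sample is read from the device's own section with an ordinary memory reference.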
The complement of input/output functions used within the ARCS computer unit is:
• Analog input conversion and conditioning circuits
• Discrete I/O
• Analog outputs and servoamplifiers
• Serial digital sensor interface
• System test interface
• Cross-channel data interface
In addition to these input/output functions, the iteration timing reference, watchdog monitor,
and servo disengage logic are embedded in the functional input/output area.
The processing of analog inputs is combined on a standard module with an analog output
section. Figure 39 shows a block diagram of this combined analog I/O module.
The analog input interface contains a 32-channel multiplexed A/D converter, 24 ac/dc
signal conditioners, and 8 direct dc inputs. Input conversions are initiated by a CPU-controlled
discrete. Each time the conversion initiate line is pulsed, the input multiplexer sequences
32 analog inputs through the A/D.
The digital results are then stored in the local scratch-pad memory in a DMA mode. Within
the DOIO structure, direct memory access to scratch-pad memory is possible every time
the CPU accesses the program memory. A minimum access rate is guaranteed for the normal
instruction set by the microprogram control of the program memory initiate line. The CPU
may access the local scratch-pad memory for A/D data at any time without experiencing
DMA interference.
The servo output interface is composed of two parts: the output D/A section and the
servoelectronics section. A block diagram of the output D/A section is contained in the
combined analog I/O module diagram in figure 39.
A block diagram of the discrete I/O interface is shown in figure 40. Within the DOIO
structure, discrete inputs appear as individual bits within four dedicated scratch-pad memory
locations. Input level changers are provided to receive up to thirty-two 28-Vdc discretes.
In addition, there is provision for twenty-four 5-Vdc transistor-transistor logic (TTL)
discrete inputs. Each scratch-pad memory read access causes the input discretes within the
selected word to be enabled onto the B bus. The system software is responsible for any
contact debounce processing or double buffering within a frame time.
Discrete outputs are generated simply by storing each discrete within a particular bit position
in one of four dedicated scratch-pad memory locations. Output level changers are provided
for up to eight 28-Vdc discretes. A total of thirty-two 5-Vdc TTL output discretes are
provided.
A utility interface defined by EIA Standard RS-232C is provided for use by the system
test panel and any ground support equipment. A block diagram of this interface is shown
in figure 41. A single UART (universal asynchronous receiver/transmitter) device provides
the serial-to-parallel/parallel-to-serial conversions and the signal formatting necessary for
the RS-232C serial data link.
A transmission is initiated by the CPU first storing an eight-bit byte (low-order eight bits)
from the C bus in the scratch-pad memory address dedicated to the transmitter and then
executing a CLR instruction for that address. When serial transmission is complete, an
interrupt is generated. The CPU may then transmit another eight-bit byte or mask the
interrupt.
Data is received through the UART and an interrupt is generated. The CPU accesses the
data by performing a read operation on the scratch-pad memory address dedicated to the
receiver. Data is received in eight-bit bytes and enabled onto the low-order eight bits
of the B bus when read. The CPU may mask the receiver interrupts at any time.
5.3.3.3 Cross-Channel Interfaces
The cross-channel data link is used to exchange data between computers for normal mode
operations such as sensor selection and output monitoring and for state variable data needed
for a recovery attempt by one of the channels. The cross-channel receiver contains a 1024-word
RAM that is considered large enough so that no double buffering of any cross-channel
data is required. Each word in the receiving buffer has a unique definition and it may be
processed directly from the buffer. The ability to leave the data in the buffer and not move
it to allow another word with a different meaning to share the receiver location saves a
significant amount of processor time.
The cross-channel transmitter shown in figure 42 contains a 64-word last-in, first-out
(LIFO) stack. Each stack entry is composed of a 10-bit label and a 16-bit data word. The CPU
loads the stack with the data to be transmitted and then initiates transmission by executing
a CLR XMIT instruction (XMIT is the select address of the transmitter). Any number of
words from 1 to 64 may be transmitted at one time. If a CLR is executed with no data
in the stack, nothing is transmitted. If at least two words are loaded into the LIFO and
then a CLR instruction is executed, the transmitter will start transmitting. The CPU may continue
to store into the LIFO while transmission is progressing; however, no transmission sequence
can be guaranteed using this procedure.
When the stack is empty (when the last word has been read into the parallel/serial transmit
register and transmission initiated), an interrupt is issued so that the CPU may load the stack
with new data. After the last piece of data for a particular sequence is transmitted, the
CPU must mask the transmitter to prevent additional interrupts. If the interrupts are
masked, the device ready flip-flop may be tested by the CPU to determine if the transmitter
LIFO is empty.
The transmitter is selected using a 16-bit address. The high-order 6 bits of the address
are the transmitter select address, and the low-order 10 bits are stored into the label portion
of the stack associated with the data word. This label is transmitted with the word and is
used by the receiver as the relative address, within the receiver buffer, to store the data
word. More than one word with the same label may be transmitted, with the last one
overwriting the previous ones. The transmitter software must be written so that no more
than 64 words are stored into the LIFO. Otherwise, the LIFO address pointer will overflow,
causing the next word to overwrite the first word stored. When the transmission is initiated,
only the number of words indicated by the LIFO pointer will be transmitted. The last word
to be stored in the LIFO is pointed to by the LIFO counter. If the CPU attempts to read
any location in the stack, this is the word it will read.
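
The addressing and overflow behavior of the transmitter can be sketched in Python. The 6-bit/10-bit address split and the 64-word depth come from the text; the select value 42, the class and function names, and the exact way the pointer wraps on overflow are assumptions made for this illustration.

```python
from typing import List, Tuple

LABEL_BITS = 10   # low-order address bits stored as the label
LIFO_DEPTH = 64   # words held by the transmitter stack


def split_address(addr: int) -> Tuple[int, int]:
    """Split a 16-bit address into (transmitter select, label)."""
    addr &= 0xFFFF
    return addr >> LABEL_BITS, addr & ((1 << LABEL_BITS) - 1)


class TransmitLifo:
    def __init__(self) -> None:
        self.slots: List[Tuple[int, int]] = [(0, 0)] * LIFO_DEPTH
        self.count = 0  # LIFO pointer: number of words waiting

    def store(self, addr: int, data: int) -> None:
        """Store a data word; the label is taken from the address."""
        _, label = split_address(addr)
        if self.count == LIFO_DEPTH:
            self.count = 0  # overflow: pointer wraps, first word is overwritten (assumed model)
        self.slots[self.count] = (label, data & 0xFFFF)
        self.count += 1

    def clr(self) -> List[Tuple[int, int]]:
        """Model of CLR XMIT: send the words indicated by the pointer, last in first."""
        sent = [self.slots[i] for i in range(self.count - 1, -1, -1)]
        self.count = 0
        return sent


XMIT = 42 << LABEL_BITS  # hypothetical transmitter select address
lifo = TransmitLifo()
lifo.store(XMIT | 7, 0x1234)
lifo.store(XMIT | 8, 0x5678)
first_burst = lifo.clr()  # label 8 is sent before label 7
```

In this model a 65th store leaves the pointer at one, so only a single word would be transmitted, which is why the transmitter software must never load more than 64 words.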
Cross-channel data is transmitted at a 2-MHz bit rate. A total of 29 bits are transmitted:
10 label, 16 data, 1 parity, and 2 for transmitter/receiver synchronization. To ensure that
the receiver has time to store the data into three different buffers, there is a 5.0-µs
separation between words. One word is thus transmitted every 19.5 µs.
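
The word period follows directly from the bit rate, the word length, and the inter-word separation, as the short Python check below shows. The time for a full 64-word stack is a derived figure that is not stated in the report; it assumes one separation interval is counted after every word.

```python
BIT_RATE_HZ = 2_000_000            # 2-MHz bit rate
BITS_PER_WORD = 10 + 16 + 1 + 2    # label + data + parity + synchronization
GAP_US = 5.0                       # separation between words

bit_time_us = 1e6 / BIT_RATE_HZ                # 0.5 us per bit
word_time_us = BITS_PER_WORD * bit_time_us     # 14.5 us on the line
word_period_us = word_time_us + GAP_US         # 19.5 us from word to word
full_stack_ms = 64 * word_period_us / 1000.0   # about 1.25 ms for 64 words (derived)
```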
A block diagram of the cross-channel receiver is shown in figure 43. Each of the receiver's
buffers contains 1K of memory, and each is hard-wired to a particular transmitter. Therefore,
no selection process is required to turn on the receiver. When a word is received, the 10-bit
label is used as the relative address within the buffer at which the data is stored. The data
is stored in a DMA mode. No transmission word can have a label outside the buffer, since
the 10-bit label may range from only 0-1023. The receiver buffer memory has both read
and write capability, so a word can be modified during processing and stored back onto the
same location. The receiver checks the label associated with each transmission,
and if it is label zero, an interrupt is generated. The transmitting computer may load label
zero at the end of a block of data that corresponds to a particular function. The receiver
will generate an interrupt when all of that data has been received. The CPU may then
process the data without waiting for the whole block transfer to be completed. Since the
transmitter has the capability of loading the LIFO stack with more than one word with
label zero, the data in the stack can be separated into blocks. Each time a complete block
is received, the CPU gets an interrupt, and it can check the word associated with label
zero for a code describing the type of transmission. This interrupt may be masked if not
needed or desired. The ready flip-flop may be tested to see if a label zero has been
received in the case where the interrupt has been inhibited. If a parity error is detected
on the transmission, the receiver sets the word to zero and stores it at label zero. This results
in an interrupt being generated. The CPU may choose to ignore the data or the interrupt.
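
The receiver behavior described above can be modeled with a few lines of Python. This is a sketch of one receiver buffer only; the class and attribute names are hypothetical, and the interrupt is represented simply as a counter.

```python
BUFFER_WORDS = 1024  # one receiver buffer, addressed by the 10-bit label


class CrossChannelReceiver:
    def __init__(self) -> None:
        self.buffer = [0] * BUFFER_WORDS
        self.ready = False             # ready flip-flop: a label zero has been received
        self.interrupt_masked = False
        self.interrupts = 0            # count of interrupts raised to the CPU

    def receive(self, label: int, data: int, parity_ok: bool = True) -> None:
        """Store one received word at the relative address given by its label."""
        if not parity_ok:
            label, data = 0, 0         # parity error: a zero word is stored at label zero
        label &= 0x3FF
        self.buffer[label] = data & 0xFFFF
        if label == 0:                 # label zero marks the end of a block
            self.ready = True
            if not self.interrupt_masked:
                self.interrupts += 1


rx = CrossChannelReceiver()
rx.receive(5, 0x1111)
rx.receive(6, 0x2222)
rx.receive(0, 0x00A5)   # block complete: the word at label zero carries a type code
```

After the third call the model has raised one interrupt, and the CPU can read the code 0x00A5 at label zero to decide how to process the block. A word received with bad parity would overwrite that location with zero and raise a further interrupt.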
5.3.3.4 Iteration Timing Reference and Watchdog Monitor
The iteration/recovery interrupt is a local channel priority interrupt. It is generated from
the iteration timing reference, which is a counter that serves as the real-time reference
for the local channel. It may be reset by the local sync command negative transition.
This will rezero or reinitialize the count period. If not reset by the local sync command,
it will continue to toggle count states and thus generate a recovery interrupt.
The watchdog monitor is composed of two sections. The first section measures the time
interval between negative edges of the input signal and generates an output pulse (F)
immediately following a negative edge if the previous period is within specified limits,
as shown in figure 44. The second section is an ac-coupled monostable multivibrator
that will generate a computer-failed discrete (H) after a specified amount of time (1.5
periods) unless reset by an in-tolerance pulse (F).
The ac-coupled monostable multivibrator is designed to generate a computer failure
discrete at a specified time interval after an in-tolerance pulse (F) is received, unless it
is "saved" from doing so by another in-tolerance pulse. In this manner the circuit is
continually being "saved" from generating the computer failure discrete, and if anything
happens to prevent pulse F from occurring, the discrete will be generated. This is fail-safe
in that any failure mode of the input signal or the 4-MHz clock is covered.
5.3.3.5 Power Supply
The computer power supply must furnish +5 Vdc logic power and ±15 Vdc analog/digital
interface power. Figure 45 shows a block diagram of the supply. Primary input power
is +28 Vdc with characteristics per MIL-STD-704A. EMI line filtering and transient suppression
through a zener diode are provided. Conducted emissions are held down to the
levels required by MIL-STD-461A.
The basic supply operates as a dc-to-dc converter with no intermediate conversion to ac
and, consequently, no transformers. The +5 Vdc logic power is generated directly from +28
Vdc via a switching regulator. The +15 Vdc output is also generated directly from +28 Vdc
but a simple linear series regulator is used. The -15 Vdc output is generated by a transformerless
positive-dc-to-negative-dc converter. A switching regulator is used to chop +28 Vdc
into a "swinging choke," which stores energy and then transfers it to the output capacitor
via a "half-wave" rectifier.
A power status monitor generates a PS VALID signal for use as a priority interrupt to the
CPU. This interrupt is what initiates automatic power-on processing.
In addition to the regulated dc power used by the computer, ac excitation for LVDT's
in the servo loops is required. It is assumed that 400-Hz excitation is available in any application
and that only a step-down transformer would be required. However, if OR'ed power
redundancy is to be employed, then a dc-to-ac inverter operating from an OR'ed 28-Vdc
bus would be required at some point in the system.
5.3.3.6 Built-in Test and Self-Monitoring Functions
Special considerations have been made in all areas of the ARCS hardware design to enhance
the built-in test (BIT) and self-monitor capabilities. In particular, the following hardware
self-monitor and built-in test functions have been designed into the ARCS computer unit.
Watchdog Monitor.—A watchdog monitor has been included to detect that class of fault
for which the processor is no longer a logical element. The watchdog monitor must detect
the falling edge of the local sync command (LSC) discrete at regular periods of 10 ms ± 100 µs. The LSC is set
and cleared under software control so the processor may use software-determined fault
status to affect the state of the watchdog monitor. When the watchdog monitor trips,
it disconnects all the servoactuators driven by that computer and provides a discrete to
the processor.
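
The two-section watchdog of section 5.3.3.4 can be modeled in Python using the 10 ms ± 100 µs window stated above and the 1.5-period timeout. Two modeling choices are assumptions: the monostable is taken to be armed at the start of the observation, and the first falling edge produces no in-tolerance pulse because no previous period exists.

```python
NOMINAL_US = 10_000                # nominal LSC period: 10 ms
TOLERANCE_US = 100                 # allowed deviation: +/- 100 us
TIMEOUT_US = 3 * NOMINAL_US // 2   # monostable timeout: 1.5 periods


def watchdog_trip_time(edges_us, end_us, start_us=0):
    """Return the time (us) at which the computer-failed discrete (H) is raised.

    edges_us: ascending times of LSC falling edges.
    end_us:   end of the observation interval.
    Returns None if the watchdog does not trip before end_us.
    """
    deadline = start_us + TIMEOUT_US   # assumed: monostable armed at start_us
    previous = None
    for t in edges_us:
        if t > deadline:
            return deadline            # timed out before this edge arrived
        if previous is not None and abs((t - previous) - NOMINAL_US) <= TOLERANCE_US:
            deadline = t + TIMEOUT_US  # in-tolerance pulse (F) "saves" the monostable
        previous = t
    return deadline if end_us > deadline else None


healthy = watchdog_trip_time([0, 10_000, 20_000, 30_000, 40_000], end_us=45_000)
late = watchdog_trip_time([0, 10_000, 20_000, 30_500, 40_500], end_us=50_000)
```

In the second case the edge at 30.5 ms ends a 10.5-ms period, so no pulse F is produced and the monostable expires 15 ms after the last good edge. A processor that stops toggling the LSC, or toggles it too quickly, trips the monitor in the same way.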
Arithmetic Fault Detector.—The arithmetic fault detector monitors all arithmetic operations
in the processor. It detects a fault whenever any of these arithmetic operations—addition,
subtraction, multiplication, division, or arithmetic left shift—attempts to create a number
that is outside the valid representation range of the computer (2's complement fractional
representation). When a fault condition is detected, an interrupt is generated through
the computer fault interrupt.
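
A software analogue of the arithmetic fault detector is sketched below in Python for 16-bit 2's complement fractional numbers, whose valid range is -1 to 1 - 2^-15. The function names, the use of an exception in place of the computer fault interrupt, and the treatment of division by zero as a fault are assumptions made for this illustration.

```python
Q15_MIN, Q15_MAX = -32768, 32767   # raw 16-bit range: -1.0 to 1 - 2**-15


class ArithmeticFault(Exception):
    """Stands in for the computer fault interrupt."""


def _check(raw: int) -> int:
    if not Q15_MIN <= raw <= Q15_MAX:
        raise ArithmeticFault(f"result {raw / 32768:+.5f} is outside [-1, 1)")
    return raw


def q15_add(a: int, b: int) -> int:
    return _check(a + b)


def q15_sub(a: int, b: int) -> int:
    return _check(a - b)


def q15_mul(a: int, b: int) -> int:
    return _check((a * b) >> 15)       # only (-1) x (-1) can overflow


def q15_div(a: int, b: int) -> int:
    if b == 0:
        raise ArithmeticFault("division by zero")  # assumed to be treated as a fault
    magnitude = (abs(a) << 15) // abs(b)           # quotient truncated toward zero
    return _check(magnitude if (a < 0) == (b < 0) else -magnitude)


def q15_shl(a: int, n: int) -> int:
    return _check(a << n)              # arithmetic left shift


def faults(operation, *args) -> bool:
    """Return True if the operation would raise the arithmetic fault."""
    try:
        operation(*args)
    except ArithmeticFault:
        return True
    return False


HALF, QUARTER = 16384, 8192            # 0.5 and 0.25 in fractional form
```

For example, `q15_add(HALF, HALF)` faults because 1.0 is not representable, and `q15_div(HALF, QUARTER)` faults because the quotient 2.0 exceeds the fractional range, whereas `q15_mul(HALF, HALF)` returns the raw value for 0.25.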
Power Supply Monitor.—A comparator monitor is used to detect out-of-tolerance conditions
on all three secondary voltages, +5 and ±15 V. An out-of-tolerance condition is used
to disconnect all the servoactuators driven by that computer. A return to an in-tolerance
condition produces a power-on interrupt. The processor is made cognizant of the actuator
connection status through the state of the actuator shutoff valve discretes.
Scratch-pad Memory Parity Generator/Checker.—Odd parity is generated for all data being
stored into the scratch-pad memory (RAM) and checked for all data being read out. If
invalid parity is detected, an interrupt is generated through the computer fault interrupt.
All subsequent activity is software determined. A parity generator inversion discrete
is controllable from within the processor for use in testing the parity generator/monitor
circuitry.
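
Odd parity generation and checking for a 16-bit scratch-pad word can be sketched in Python as follows. The function names are hypothetical, and the `invert` argument is only a model of the parity generator inversion discrete used to test the checking circuitry.

```python
def odd_parity_bit(word: int) -> int:
    """Parity bit that makes the total count of 1s (16 data bits + parity) odd."""
    ones = bin(word & 0xFFFF).count("1")
    return 0 if ones % 2 else 1


def write_with_parity(word: int, invert: bool = False):
    """Return (word, parity) as stored; `invert` models the inversion discrete."""
    parity = odd_parity_bit(word)
    if invert:
        parity ^= 1                    # deliberately store wrong parity for self-test
    return word & 0xFFFF, parity


def parity_ok(stored) -> bool:
    """Check a stored (word, parity) pair on readout."""
    word, parity = stored
    return parity == odd_parity_bit(word)


good = write_with_parity(0x1234)
forced_bad = write_with_parity(0x1234, invert=True)
```

A normally written word passes the check, a word written with the inversion discrete active fails it, and any single-bit change to a stored word is also detected.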
Asynchronous Digital Input/Output Validity.—In-line validity checking (parity plus bit
framing where appropriate) is performed on all asynchronous digital input devices (interchannel
data link, system test panel data link, and SPBP). Invalid data detection results
in an interrupt to the processor. Those receivers with a RAM capability will not store the
invalid data in the RAM. The system test panel receiver will pass the data to the processor,
with the validity bits in the upper byte of the word.
Discrete Input/Output Loop.—The discrete input/output module has been designed to
provide loop testing capability within the module for all discrete outputs and inputs.
The loops are activated by a self-test enable discrete that can be controlled by software.
Analog Input/Output Loop.—The analog input/output module has been designed to provide
loop testing capability within the module for all analog inputs and outputs. The loops
are activated by a self-test enable discrete that can be controlled by software.
Asynchronous Digital Input/Output Loop.—All asynchronous digital input/output modules
have been designed to provide loop testing capability within the module through the transmitter
and receiver(s). The loops are activated by a self-test enable discrete that can be
controlled by software. Inverse parity generation within the respective transmitters is
also controlled by a software-generated discrete.
Independent Servo Monitor.—An independent servo-loop monitor compares the two independent
servo command signals for each servoactuator, which are summed in the servovalve.
This monitor detects failures of the multiplex D/A converter and sample and hold
circuits in addition to failures within the servo loop. Upon detecting a failure, the independent
monitor disconnects the servo involved and signals the processor through the state of the
actuator shutoff valve (SOV) discrete.
5.3.4 SYSTEM TEST PANEL
The system test panel (STP) provides the control and display interface necessary for the
flight and maintenance crews to perform various system tests. As indicated previously,
the STP will communicate with each ARCS computer unit via a serial digital RS-232C
data link. Figure 46 is a block diagram of the electronics installed within the panel to provide
the required control, display, and interface.
The operation of the STP may be described two ways: in terms of basic hardware design
functions or in terms of the system user functions that are determined by both hardware
and software. Since the software aspects of system test are discussed in appendix B, this
section will be concerned only with hardware design functions.
As shown in figure 46, the heart of the STP electronics is a single UART device. This LSI
device provides the serial-to-parallel and parallel-to-serial conversions, as well as the signal
formatting necessary for the RS-232C interface with the computers. The same serial data
is transmitted simultaneously from the STP to each computer. One eight-bit word is
transmitted each time a momentary switch is activated on the panel. The rotary mode
control switches on the STP, shown in figure 47, will not cause any data transmissions
when activated by themselves (the STP OFF position will, however, prevent transmissions).
The content of the eight-bit word is determined by the encoding logic following the panel
switches.
The STP receives data in eight-bit bytes from any one of the computers connected to it.
Individual computer valid discretes are used to automatically select which computer's
data is received by the panel. When all computers indicate valid operational status, then
computer A data is the normal receiver selection. Two types of data are received: mode
status annunciation data and alphanumeric display data. The first type is used to illuminate
indicator lamps that confirm momentary switch activations or show status (e.g., the FAIL
and ALERT indicators in fig. 47). The second type of data is an ASCII character string
that provides a message for the alphanumeric display. The Burroughs' "self-scan" display
(with memory) is the type used. The current Boeing requirement is for at least a 12-character
display capability. Each character string message is held on the display until a
new message is received from the computer.
Because the STP is intended to be installed within the flight deck area, it is required to
operate without cooling air in a Category A1 environment per reference 3. Internally
regulated dc power supplies provide panel logic power and display power. A dimmer-controlled
28-Vdc source powers all indicator lamps. The prime input power source for the STP
is expected to be 28 Vdc. If necessary, redundant 28-Vdc sources may be OR'ed together
to provide power to the panel in the event of power system failures.
The two basic functions of the STP are to control the test function and to display test
results. The result of combining the airline operational needs with ARCS functional characteristics
is represented by the panel layout shown in figure 47.
The STP will provide the following control features:
START—A momentary action backlighted pushbutton for activation of the ground test
function for preflight or maintenance purposes. Activation of the test function is acknowledged
when the ARCS computers cause the button to be illuminated.
CONFIRM—A momentary action pushbutton by which the crew can confirm that the
action required by the test in progress has been accomplished.
CONTINUE—A momentary action pushbutton by which the operator can cause the system
to resume testing following a programmed stop. This might be the case when, for example,
the computers are ready to begin servo testing but will not proceed until requested to do so.
END—A momentary action pushbutton which, when activated, will cause the ground test
function to skip all remaining tests.
READ—A momentary action pushbutton that will cause the current failure status of the
selected test to be displayed. Repeated activation of this function will cause sequential
messages from computer memory to be displayed.
Test mode selector—A manual rotary switch with four positions: OFF, STATUS,
PREFLIGHT, and MAINT. This switch position causes the ARCS computers to respond
with the corresponding system test function, e.g., STATUS will cause a display of the
current operational status of the system, PREFLIGHT will cause a preflight test to be
initiated when the start button is depressed, and MAINT will select a full ground test.
Test function selector—An eight-position rotary switch that allows the operator the flexibility
of selecting either a test of the entire system or some particular subset of the total system.
Selectable functions include SYSTEM, SENSORS, COMPUTERS, or any of the SERVO-
actuators.
Lamp test—A lamp test button which, when activated, will verify all indicator lamps and
display elements.
The STP will provide the following display features:
FAIL—A red FAIL light that will illuminate when a failure has been detected which is
applicable to the currently selected mode or which results in a loss of dispatch capability
of the airplane.
ALERT—A yellow ALERT light that will illuminate when a failure condition has been
detected which does not affect the current mode of operation or the dispatch capability
of the system.
Readout—A test readout capable of displaying a minimum of 10 alphanumeric characters.
This display will be used to annunciate LRU-level failure information, system status or
operational capability, operator action requirements, or any other information or cues
that require a text-type readout.
6.0 ARCS DESIGN ANALYSIS
The purpose and scope of the analytical work performed to synthesize an ARCS design
concept and establish its fault-tolerant characteristics are illustrated by figure 48. The
illustration shows the relationships of activities involved in a complete development of a
fault-tolerant system from design to actual system (hardware and software) testing for
design verification. Activities bounded by solid frames were within the scope of the
ARCS program. The design synthesis resulting in the system concept described in section
5 is summarized in appendix D.
The first part of this section (sec. 6.1) deals with the development of analytical tools for
measuring or assessing the fault tolerance of a redundant computer system, a primary
activity within the ARCS program. This part describes the approaches and results of the
fault analysis, reliability model synthesis, and probability projections performed for the
ARCS and the baseline WWCS.
Closely related to the specification of fault-tolerance requirements, and to the analysis
performed during the design phase, is the verification of the fault-tolerant performance
of the fully mechanized system; refer again to figure 48. During the ARCS study, General
Electric, working closely with Boeing, developed a new approach for measuring coverage.
This approach is based on a failure analysis of a randomly selected set of failure modes
extracted from the entire failure mode population. The main theme of this method is
discussed in section 6.1.
The second part of this section (sec. 6.2) is an assessment of the cost effectiveness of applying
ARCS technology, carried out in two parts: an analysis of airline cost-of-ownership for
an ARCS maintained in a Category III operational status and an analysis of the cost effect
of providing an integrated system test function.
6.1 FAULT-TOLERANCE ANALYSIS
The ARCS fault-tolerance analysis is organized into three parts, as
illustrated in figure 49. The fault analysis is a prerequisite for the synthesis of a reliability
model, and the reliability model is used to derive success probability projections for a
given design with a given set of input assumptions.
Section 6.1.1 discusses the approach used, and the functional simulation applied, to establish
confidence that the ARCS design will indeed provide fault-tolerant performance. Section
6.1.2 describes the reliability modeling, introduces the computer programs used to compile
the WWCS and ARCS reliability estimates, and presents the results of those estimates.
6.1.1 FAULT ANALYSIS
In general, system fault analysis has two primary purposes: to aid in developing a workable
system concept during the design phase and to validate the system design once the design
is complete. Only the first applies in the ARCS study, where the particular objectives of
the fault analysis task were to (1) develop a system concept that guarantees first-failure
survivability and (2) evaluate the viability of the developed reconfiguration processes.
The ARCS fault tolerance is founded on the use of three or more autonomous redundant
computers exchanging data via digital data buses, each providing a mechanical output
into a mechanical voting mechanism having complete fault tolerance for first failure.
The technology to realize the complete isolation from nonintentional cross-channel effects,
and to realize a sufficiently reliable mechanical voter, was judged to be state of the art.
With the above assumptions, the following breakdown of detectable fault classes can be
made based on the relationship of the fault to the different monitor functions presented
in the ARCS design concept of section 5.
Fault class                                   Associated monitor
--------------------------------------------  ---------------------
Any failure causing a processor not to        Watchdog monitor
  complete tasks on schedule
Any failure causing a processor to produce    Output monitor
  dissimilar results
Any fault upstream of the virtual SSFD        SSFD
  voting node
Any fault downstream of the computer          Servo monitor
  output monitor node
Electrical power loss                         (Trivial)
Any failure causing corruption of             Cross-channel monitor
  cross-channel data between two
  computer units
For the level of design detail that the scope of the ARCS program allows, a further breakdown
of fault classes is not meaningful. The purpose of the ARCS fault-tolerance analysis
was therefore to validate that the design concept will handle the classes of faults listed
above through the range of triplex, duplex, and simplex states, as well as transient-fault
recoveries in each of these redundancy states.
For that purpose, the ARCS concept was analyzed on paper and was implemented into
a functional simulation—the Reconfigurable Computer System Simulation (RCSS)—
representing three synchronized but autonomous computers processing the algorithms
required to perform all the ARCS processes that are significant to the fault-tolerant
capability of the system. The simulation was used as a tool in the fault analysis to help
evaluate the reconfiguration processes. These processes, though each conceptually under-
standable, are difficult to validate using only a paper analysis.
6.1.1.1 Paper Analysis
Several levels of paper analysis can be performed on a design concept such as the ARCS.
The synchronization process was examined as a part of the design analysis (see app. D)
to ensure that the selected algorithms met the basic requirements. The initial synchronization
was examined from a system point of view (see sec. 5.1) to ensure that the
selected algorithms accomplished the system requirements as prescribed in section 4.2.
The recovery strategy of table 3 was analyzed and used to specify the sequential machine
of figure 50. The 16 states of figure 50 represent the acceptance settings of the transient
and permanent failure flags defined in table 3. In this figure, a state such as q1 (LXY)
indicates that the transient failure flag of Y is set. (See table 3, third channel synchronized
before sensor settling.) A state such as q4 (LX), Y not present, indicates that the Y
permanent flag is set. (See table 3, two-channel operation after sensor settling.)
If the local computer determines that it is simplex, it will start in state q15 (L). If it
determines that another computer is already operating (i.e., duplex recovery), it will
recover to it and start in state q5 (LX) or q12 (LY), depending on whether the other computer
is to its left or right. If it determines that two other computers are operating,
it will recover to one of them. If no transient failure flag is set, the local computer begins
operation in q2 (LXY). If a transient failure flag is set, it will begin operation in q8 (XY)
or q9 (XY). When the flag is cleared, it will transition to q7 (XY), from which recovery to
q2 (LXY) can then proceed. Had the fault associated with q8 or q9 been declared permanent,
the local computer would have transitioned to q13 (X) or q14 (Y), from which recovery
to q5 (LX) or q12 (LY) could proceed.
In figures 51, 52, and 53, the table at the top of each figure gives the sequence of states
through which the software would transition from each computer's point of view. The
first column specifies the local (L), left (X), and right (Y) channels.
Figure 51 shows the power-on sequence for the three computers. Computer No. 1 comes on,
notes that it is alone, and proceeds to operate in simplex. Computer No. 2 comes on, notes
that No. 1 is operating and recovers to it. Computer No. 3 comes on, notes that Nos. 1 and
2 are operating and that No. 2's do-not-use flag is set, and waits for the flag to clear before
attempting recovery. The key point here is that as long as a transient failure flag is set,
the third computer will not attempt to recover. If computer No. 2 fails to recover, it
will be permanently faulted and No. 3 will attempt to recover, as shown in figure 52.
Figure 53 shows the case where No. 2 recovers and No. 3 starts recovery but No. 2 detects
a fault. In this case, No. 3 is faulted. However, since No. 3 will assess that it is fault free,
it will get another chance at recovery as soon as the fault assessment in No. 2 is resolved.
Figure 54 shows the case where the system function is triplex and a fault is detected and
declared permanent.
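The start-up rules described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `StartupAction` and `startup_action` are hypothetical and do not come from the ARCS software, and the sketch deliberately uses descriptive actions rather than the state numbers of figure 50.

```python
from enum import Enum


class StartupAction(Enum):
    OPERATE_SIMPLEX = "operate in simplex"
    RECOVER_TO_PEER = "recover to an operating computer"
    WAIT_FOR_FLAG = "wait for the transient failure flag to clear"


def startup_action(operating_peers: int, transient_flag_set: bool) -> StartupAction:
    """Choose what a computer does when it comes on (hypothetical helper).

    operating_peers: number of other computers already operating (0, 1, or 2).
    transient_flag_set: True if a transient failure flag is set among the
        computers that are already operating.
    """
    if operating_peers not in (0, 1, 2):
        raise ValueError("a triplex system has at most two peers")
    if operating_peers == 0:
        # The computer notes that it is alone and proceeds in simplex.
        return StartupAction.OPERATE_SIMPLEX
    if operating_peers == 2 and transient_flag_set:
        # The third computer does not attempt recovery while a transient
        # failure flag is set; it waits until the flag clears or the fault
        # is declared permanent.
        return StartupAction.WAIT_FOR_FLAG
    return StartupAction.RECOVER_TO_PEER
```

For the power-on sequence of figure 51, this sketch would return `OPERATE_SIMPLEX` for computer No. 1, `RECOVER_TO_PEER` for No. 2, and `WAIT_FOR_FLAG` for No. 3 while No. 2 is still recovering.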
Table 3.—Recovery Strategy

                                     Transient        Permanent
                                     failure flags    failure flags
Operation                            A  B  C          A  B  C
-----------------------------------  ---------------  ---------------
Simplex                              0  0  0          0  1  1
After second channel synchronized
  Before sensor settling             0  1  0          0  0  1
  After sensor settling              0  0  0          0  0  1
Third channel synchronized
  Before sensor settling             0  0  1          0  0  0
  After sensor settling              0  0  0          0  0  0

Figure 50.—Local Computer's Assessment of a System Function
(Legend for figures 50 through 54: L - local; X - left; Y - right)

Figure 51.—Power-On/Watchdog Monitor Trip Recovery as Processed by
an Operating Computer (Triplex Operation Attained)

Figure 52.—Power-On and Watchdog Monitor Trip Recovery as Processed by
an Operating Computer (Duplex Operation Attained)

Figure 53.—Power-On/Watchdog Monitor Trip Recovery as Processed by
an Operating Computer (Simplex Operation Attained)

Figure 54.—Power-On/Watchdog Monitor Trip as Processed by
an Operating Computer (Triplex-to-Duplex Degradation)

6.1.1.2 Simulation Results
The ARCS simulation using the general simulation program RCSS (Redundant Computer
System Simulation) was used to (1) explore various recovery concepts and (2) verify
reconfiguration design completeness.
The simulation treats the parallel real-time processing of the redundant channels
sequentially, subprocess by subprocess. Once the real-time-referenced processing of an
ARCS subprocess has been performed for each of the three simulated processors in turn and
the resulting data have been exchanged between them, processing of the next subprocess
between data exchange points is performed, and so on until the frame processing is completed. Appendix
E shows the use of the simulation to demonstrate the power-on/power-fault reconfiguration
process and illustrates how the simulation was used to help verify the ARCS design concept.
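A minimal sketch of this sequential treatment of parallel channels is given here. It is not the RCSS program; the function `simulate_frame` and the two example subprocesses are hypothetical, and channel data are reduced to a single number per channel for brevity.

```python
from typing import Callable, Dict, List

# A subprocess takes (channel id, data visible to that channel) and returns
# the value the channel transmits at the next data exchange point.
Subprocess = Callable[[int, Dict[int, float]], float]


def simulate_frame(subprocesses: List[Subprocess],
                   initial: Dict[int, float]) -> Dict[int, float]:
    """Run one frame for all channels, subprocess by subprocess."""
    exchanged = dict(initial)
    for subprocess in subprocesses:
        # Every channel computes from the same pre-exchange snapshot, which
        # mimics parallel execution even though the loop is sequential.
        outputs = {ch: subprocess(ch, exchanged) for ch in sorted(exchanged)}
        # Data exchange point: all channels now see the new values.
        exchanged = outputs
    return exchanged


def add_channel_bias(ch: int, data: Dict[int, float]) -> float:
    """Hypothetical subprocess: each channel adds its own id to its value."""
    return data[ch] + ch


def average_all(ch: int, data: Dict[int, float]) -> float:
    """Hypothetical subprocess: each channel averages the exchanged values."""
    return sum(data.values()) / len(data)


result = simulate_frame([add_channel_bias, average_all],
                        {1: 0.0, 2: 0.0, 3: 0.0})
```

The design choice illustrated is that no channel sees another channel's output for the current subprocess until the exchange point, so the order in which the three channels are processed inside the loop does not affect the result.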
In the area of recovery, two concepts were explored. The first was that of each system
function handling complete recovery for that function. This concept gives functions such
as the SSFD access to the channel synchronization status. Although workable, this concept
was found to contribute to extremely complex data management.
The second concept—a hierarchy of recovery—allows the consolidation of recovery and
redundancy degradation into one function, based on synchronization status and cross-channel
monitor results. If synchronization is not attained or the cross-channel data link transmits
erroneous data, the offending channel's permanent flags will be set by the recovery/redundancy
degradation process.
The hierarchy of recovery concept was selected because it reduces data management and
should ultimately reduce execution time by terminating recovery at the first level at which
recovery is impossible. The savings in data management would be realized since lower level
processes, such as the SSFD, would not be required to have access to the synchronization
status. This should reduce the complexity of the data interface. Execution time should
also be reduced, since the subroutine calling sequence would be simplified. However, the
importance of any reduction can be assessed only after software implementation.
Design completeness was verified by simulating a number of cases and observing the result.
Due to cost and time, this verification was not exhaustive, but it did show the benefit
of such processing by pointing out both design and implementation errors.
The magnitude of the reconfiguration problem was brought out mainly by the sheer volume
of data that must be recorded to analyze results. Appendix E describes several power-on
and power-fault simulations.
6.1.2 RELIABILITY ANALYSIS
The ARCS reliability analysis had two main objectives: (1) to survey available reliability
assessment tools and establish a reliability assessment technique applicable to the analysis
of a redundant reconfigurable system and (2) to exercise this technique to predict reliability
numbers for the WWCS and ARCS baseline configurations and for several ARCS trade
study configurations.
Section 6.1.2.1 discusses some currently available reliability estimation methods and gives
an account of the considerations that led to the development of a new reliability assessment
tool, CARSRA (Computer Aided Redundant System Reliability Analysis). CARSRA is a
FORTRAN program specifically developed during the ARCS program. A brief account
of this program is given below and a detailed description may be found in appendix F,
which also contains a user's guide.
Section 6.1.2.2 describes the essence of the new method developed to assess coverage
parameters. Details of this method are contained in appendix G.
Section 6.1.2.3 presents the main reliability results and conclusions from the ARCS/
WWCS reliability study. A detailed account of the background data needed for the analyses
is given in appendix H.
Section 6.1.2.4 summarizes conclusions and observations from the ARCS reliability analysis.
6.1.2.1 The Reliability Assessment Technique
The ARCS exhibits several unique properties that complicate the reliability analysis, one
of which is the ability to degrade from triplex to duplex redundancy following a first
failure, and from duplex to simplex upon the majority of all second failures. The conditional
probability that the system survives given a second failure is denoted "the second-failure
coverage" for the system. This parameter is of great significance in the reliability assessment
and should be incorporated into any viable reliability model.
Another property of ARCS is its ability to survive most transient faults. Modeling of transient
faults further complicates the reliability analysis. Although nonsurvivable transients con-
ceivably could be modeled by increasing the permanent equipment failure rates, this increase
would have to depend on the operational redundancy level since the system is more likely
to survive transients at triplex redundancy than when operating in duplex.
A special requirement for the ARCS reliability study was the need to evaluate system
functional readiness as well as system failure probability. Functional readiness is the proba-
bility of operating at some prescribed system redundancy state (defined as the functional
readiness criterion) at a particular time. This parameter is of interest for fault-tolerant
systems that are capable of sustaining module failure(s) without losing the availability
of the system function. For the ARCS, functional readiness is of interest in the context
of permitting Cat III or STOL landings with certain parts of the flight control system
failed. This will have the benefit of substantially increasing the Cat III and STOL function
availability.
Survey of Available Reliability Assessment Tools and Approaches. —There exists a rich
proliferation of reliability assessment tools, most of them in the form of digital computer
programs. Table 4 displays analysis methods used by 10 different reliability estimation
programs, 6 of which are currently in use at Boeing. Descriptions of the CARE, CARE2,
CAST, and TASRA programs were obtained from NASA/LRC.
Table 5 shows different program features. Of the methods shown in table 4, the Markov
model and simulation are the only techniques capable of handling the special requirements
of the ARCS analysis.
Table 4.—Surveyed Reliability Programs and Associated Analysis Methods
Programs: ARMM (B), BINO (B), CARE, CARE2, CAST, CEPFRA (B), COBRA (B), CRAM (B), SIM (B), TASRA
Methods: event tabulation, binomial expansion, Boolean algebra, conditional probabilities,
standard configurations, Markov model, simulation
(B) = programs used by Boeing
Table 5.—Reliability Program Features
Programs: ARMM (B), BINO (B), CARE, CARE2, CAST, CEPFRA (B), COBRA (B), CRAM (B), SIM (B), TASRA
Features: stage dependencies, active redundancy, standby redundancy, permanent fault coverage,
transient fault recoverage, multiple dissimilar failure modes
(X) = limited ability
(B) = programs used by Boeing
The CAST program was developed by Ultrasystems under a contract from NASA/LRC.
It is a two-step approach that uses a Monte Carlo simulation to estimate parameters in a
Markov model describing the system. This Markov model is in turn used to evaluate
system reliability. The approach presumes that the system is simple enough to permit
modeling without having to deal with an exorbitant number of Markov states.
CEPFRA is a Boeing program based on the Markov approach. Like CAST, it presumes
a relatively small number of Markov states (<35). Since several hundred states would be
needed to model the ARCS by a single Markov model, the applicability of the CAST
and the CEPFRA programs to the ARCS reliability assessment task is limited.
ARCS-Developed Analysis Tools.—Because of the limitations of currently available comput-
erized reliability analysis tools, the decision was made to develop a new tool—CARSRA
(Computer Aided Redundant System Reliability Analysis).
CARSRA is a general-purpose reliability analysis program that handles modular-redundant
reconfigurable systems taking into account such factors as coverage and transient faults.
It evaluates functional readiness and system failure probabilities for two different failure
modes, using a unique approach that combines Markov modeling with dependency tabulation.
As a result, the dimensional problem of the Markov model is overcome by system partitioning
into stages, or sets of redundant identical modules, so that each stage may be modeled
by a dedicated Markov model of low order. The details of this approach as well as a guide
to the use of the program may be found in appendix F.
In the process of developing CARSRA, a simplified version was written called RESRA
(Redundant System Reliability Analysis). RESRA uses the same system-partitioning approach
as CARSRA, with the difference that the stage reliabilities are expressed by closed-form
analytical expressions rather than via a Markov model. In addition, RESRA does not compute
functional readiness. RESRA requires less input information and is therefore easier to use
than CARSRA. CARSRA was used to generate all the reliability data presented in the
following.
For the purpose of the reliability analysis, the system is partitioned into stages and modules,
where a module is defined as a set of elements performing a specified function. Each
stage is modeled by a Markov model describing the different redundancy states of the
stage. The two last states (assigned the highest numbers) in the Markov model are stage
failure states. The next to the last state corresponds to a detected failure and the last
state to an undetected failure. Up to nine states in the Markov model are permitted for
each stage. An example of a stage Markov model is given in figure 55.
Depending on the way in which the various stages have been defined, some stages will
generally have the property that failure of one module in the stage will cause loss of function
of a module in one or several other stages. This introduces a statistical dependency between
the various stages that has to be taken into account in the analysis.
[Figure 55 states: no failures; one failure; two failures; stage failures (detected, undetected)]
Figure 55.—Example of Markov Model of a Triplex Stage
The system dependency structure may conveniently be described by a dependency tree
diagram, an example of which is shown in figure 56. Dependencies between the various
stages are indicated by the lines connecting the stages, and the "direction" of a dependency
is indicated by an arrowhead. The dependency tree may be used to establish the dependency
table by which the system dependency structure is specified for CARSRA. The numbers
to the right of the stage blocks of figure 56 refer to the modular redundancy level of the
stage.
Figure 56.—Flight Control System Dependency Tree

Functional readiness, FR(t), is defined as being the probability that a certain prescribed
function is available after some time, t, of system operation. Thus, the FR is equal to
unity at time zero and decreases toward zero for long exposure times. System functional
readiness for selected module failure patterns will be computed by CARSRA. If, for
example, n different module failure patterns are specified, the program will compute the
probability of having each of these failure patterns at a time t: $FR_i(t)$, i = 1, 2, 3, ..., n.
The sum of these probabilities yields the functional readiness, FR(t), corresponding to the
functional readiness criterion consisting of the n specified module failure conditions:

$$FR(t) = \sum_{i=1}^{n} FR_i(t)$$
In this context it is also of interest to assess the probability of system failure given a certain
functional readiness criterion. CARSRA computes this quantity. The probability of system
failure at a certain time T from the beginning of the critical mission phase (for example,
from alert height in a Cat III landing) is computed given each of the n module failure
configurations in the functional readiness criterion. Let these failure probabilities be
$FP_i(T)$. The system failure probability FP(t, T) given a prescribed functional readiness
criterion is then evaluated by:

$$FP(t, T) = \frac{\sum_{i=1}^{n} FR_i(t) \cdot FP_i(T)}{FR(t)}$$
CARSRA evaluates FP(t, T) for two different failure modes and any selected t, T pair,
as well as the functional readiness FR(t).
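The two quantities defined above can be evaluated with a few lines of code. The sketch below is not CARSRA; the function names and the numerical inputs are hypothetical and serve only to show how the per-pattern probabilities combine.

```python
from typing import Sequence


def functional_readiness(fr_patterns: Sequence[float]) -> float:
    """FR(t): sum of the probabilities FR_i(t) of the n allowed patterns."""
    return sum(fr_patterns)


def conditional_failure_probability(fr_patterns: Sequence[float],
                                    fp_patterns: Sequence[float]) -> float:
    """FP(t, T): failure probability given the readiness criterion is met.

    fr_patterns[i] is FR_i(t); fp_patterns[i] is FP_i(T), the failure
    probability over the critical phase when starting from pattern i.
    """
    if len(fr_patterns) != len(fp_patterns):
        raise ValueError("one FP_i(T) is needed for each FR_i(t)")
    fr = functional_readiness(fr_patterns)
    if fr <= 0.0:
        raise ValueError("functional readiness must be positive")
    weighted = sum(fr_i * fp_i for fr_i, fp_i in zip(fr_patterns, fp_patterns))
    return weighted / fr


# Hypothetical inputs: pattern 0 = no module failed, pattern 1 = one failed.
fr_values = [0.90, 0.08]
fp_values = [1.0e-9, 1.0e-6]
fr_total = functional_readiness(fr_values)
fp_total = conditional_failure_probability(fr_values, fp_values)
```

The division by FR(t) makes FP(t, T) a weighted average of the per-pattern failure probabilities, so it always lies between the smallest and the largest $FP_i(T)$.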
Functional readiness is strictly a system property which assumes that the system is fault
free at time zero and that no maintenance action is performed in the interval [0, t].
Functional availability differs from functional readiness in that maintenance is taken
into account. The availability has a constant value as soon as a certain maintenance
schedule is specified. It may be interpreted as being the average functional readiness
given a prescribed maintenance strategy. To clarify this, assume for simplicity a maintenance
strategy based on periodic maintenance with a maintenance interval $T_M$. The probability
of having the system operational at a certain redundancy level, i.e., the functional readiness,
will vary between each maintenance time. Figure 57 displays the probabilities of (a)
no module failure ($FR_0$) and (b) a single module failure ($FR_1$) as a function of time.
The availability, AV, of the system assuming a functional readiness criterion, equivalent
to all modules operational or a single module failure, is then the average:

$$AV = \frac{1}{T_M} \int_0^{T_M} \left( FR_0(t) + FR_1(t) \right) dt = \overline{FR}_0 + \overline{FR}_1$$
The system failure probability, given the above defined functional readiness criterion
and maintenance strategy, is given by:

$$\overline{FP}(T) = \frac{1}{T_M} \int_0^{T_M} FP(t, T)\, dt$$
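The averaging over a maintenance interval can be illustrated numerically. The sketch below assumes a single triplex stage of independent modules with a constant failure rate and no repair between maintenance actions; the rate `LAM`, the interval `T_M`, and the closed-form readiness curves are assumptions made for this example and are not taken from the ARCS data.

```python
import math
from typing import Callable


def average_over_interval(f: Callable[[float], float],
                          t_m: float, steps: int = 1000) -> float:
    """Trapezoidal estimate of (1/t_m) * integral of f over [0, t_m]."""
    h = t_m / steps
    total = 0.5 * (f(0.0) + f(t_m)) + sum(f(k * h) for k in range(1, steps))
    return total * h / t_m


LAM = 100e-6   # hypothetical per-module failure rate, per hour
T_M = 500.0    # hypothetical maintenance interval, hours


def fr0(t: float) -> float:
    """Probability that none of the three modules has failed by time t."""
    return math.exp(-3.0 * LAM * t)


def fr1(t: float) -> float:
    """Probability that exactly one of the three modules has failed by t."""
    return 3.0 * math.exp(-2.0 * LAM * t) * (1.0 - math.exp(-LAM * t))


# Availability for the criterion "all modules operational or one failed".
availability = average_over_interval(lambda t: fr0(t) + fr1(t), T_M)
```

The same helper could average FP(t, T) over the interval for a fixed T, which corresponds to the system failure probability expression given above.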
ARCS Reliability Study Components.—Two major tasks must be carried out before the
CARSRA input data may be identified. The first is to establish the system model, and the
second is to assess the model parameters. An overview of CARSRA reliability assessment
components is given in figure 58.
An important assumption for the ARCS reliability study concerns the effect of power
failures. Aircraft power buses, in general, exhibit large failure rates (on the order of
$0.5 \times 10^{-3}$/hour). This failure rate will usually be larger than the failure rates of most of the
equipment served by the bus. In systems where a high level of reliability is demanded,
bus redundancy is often employed whereby the function of a bus is backed up by another
bus or power source. In effect, power bus redundancy will make each power bus fail-operational
so that it takes two consecutive failures before power on any one bus is lost,
making the effective failure rate of the bus negligible in comparison to that of the equipment
served by the bus. Possible ways of implementing bus redundancy are:
• Switching between buses
• Backup batteries
• OR-ing of dc buses
• Dual ac inputs
All of these approaches have been used in the past, and it is safe to assume that redundancy
will be designed into any new commercial transport power system. For the ARCS reliability
study, the assumption was therefore made that the influence of power failures may be
ignored.
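A rough numerical illustration of why a backed-up bus can be neglected is sketched below. It assumes two independent power sources with the same constant failure rate and no repair during the exposure time; the one-hour exposure and the function name `loss_probability` are assumptions for this example only.

```python
import math


def loss_probability(failure_rate_per_hour: float,
                     exposure_hours: float,
                     redundant_sources: int = 1) -> float:
    """Probability that all independent sources fail within the exposure."""
    single = 1.0 - math.exp(-failure_rate_per_hour * exposure_hours)
    return single ** redundant_sources


BUS_RATE = 0.5e-3   # per hour, the order of magnitude quoted in the text
EXPOSURE = 1.0      # hours, hypothetical exposure time

single_bus = loss_probability(BUS_RATE, EXPOSURE)
dual_bus = loss_probability(BUS_RATE, EXPOSURE, redundant_sources=2)
```

Under these assumptions the probability of losing power drops by more than three orders of magnitude when a second source backs up the bus, which is the sense in which the bus contribution becomes negligible.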
The system modeling task begins with partitioning the system into stages where a stage,
in the ARCS list of definitions, has been defined as being a set of identical redundant
modules. The second step consists of defining the dependency structure in the form of a
dependency tree diagram, which identifies all dependencies of the type "when module
A fails, it will cause loss of function of modules B, C, D, etc." A Markov model is then
established for every stage, describing the possible operational states of the stage and
stage failure states (detected and undetected).
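The dependency structure lends itself to a simple table lookup. The sketch below is a hypothetical illustration, not the CARSRA dependency table format: it assumes that dependencies propagate transitively and uses made-up module names A through D.

```python
from typing import Dict, List, Set


def lost_functions(failed: str,
                   dependency_table: Dict[str, List[str]]) -> Set[str]:
    """Return every module whose function is lost when `failed` fails.

    dependency_table maps a module to the modules that directly lose
    their function when it fails. Propagation through several levels is
    an assumption of this sketch.
    """
    lost: Set[str] = set()
    pending = [failed]
    while pending:
        module = pending.pop()
        if module in lost:
            continue
        lost.add(module)
        pending.extend(dependency_table.get(module, []))
    return lost


# Hypothetical tree: when A fails, B and C lose function; C takes D with it.
example_table = {"A": ["B", "C"], "C": ["D"]}
lost_after_a = lost_functions("A", example_table)
```

Here `lost_after_a` contains A, B, C, and D, while a failure of B alone affects only B.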
System dependency tree diagrams for the ARCS near-, intermediate-, and far-term
application models are shown in figures 59 through 61. The interpretation of this type
of representation was described above. Each stage is represented by a block. The significance
of the numbers associated with the blocks is as follows:
    Stage identifier (block name)
    23            Stage no. used by CARSRA
    Redundancy level
    100/.4/.75    Failure rate (10^-6/hour) / r24 / r35

Figure 59.—Near-Term ARCS Dependency Tree
(NOTE: FOR CWS, DELETE R/A AND ILS)

Figure 60.—Intermediate-Term ARCS Dependency Tree
(NOTE: FOR CWS, DELETE R/A AND ILS)

Figure 61.—Far-Term ARCS Dependency Tree
The circles to the right identify system functions all of which may, or may not, be required
for a certain operational mode. The assumption is made for the near- and intermediate-term
model that all stages but the radio altimeter and ILS are needed for the CWS mode. In
the far-term system the following servo functions are assumed to be redundant:
• Upper and lower rudder
• Left aileron and left spoiler (or left tip spoiler)
• Right aileron and right spoiler (or right tip spoiler)
A detailed evaluation of the transition rate ratios r24 and r35, a difficult and time-consuming
task, was outside the ARCS program scope. Conservative estimates of those parameters
based on engineering judgment and several available sources were therefore used in the
reliability analysis. These sources and the estimated transition rates are summarized in
appendix H.
The Markov model assumed for the near- and intermediate-term ARCS is presented in
figure 62. There are five states, three success and two failure states. A crucial assumption
implicit in this model is the absence of a single-point failure mode, i.e., the absence of
a transition from state 1 to a failed state. Thus the system is assumed to survive all possible
single failures, an assumption that has to be carefully verified by a failure mode effect
analysis (FMEA). The stage may or may not survive a second failure, as modeled by the
transitions from state 2 to 3 and from state 2 to 4, respectively. All second failures are,
however, assumed detectable by comparison of similar signals. The rate of transitions
between states is specified by $\lambda_{ij}$, which is partitioned into a permanent failure component
and a transient failure component. The permanent component, $\lambda_{pi} \cdot r_{ij}$, is the product
of the exit rate from state i due to permanent failures, $\lambda_{pi}$, and the ratio (or
fraction) of these transitions that goes to state j. Similarly, the transient component, $\lambda_{Ti} \cdot \ell_{ij}$,
is partitioned into a transient transition rate, $\lambda_{Ti}$, and a "leakage" factor, $\ell_{ij}$:

$$\lambda_{ij} = \lambda_{pi} \cdot r_{ij} + \lambda_{Ti} \cdot \ell_{ij}$$

The failure coverage, $c_i$, for state i, is defined as a sum of rate ratios, $r_{ij}$, for the transitions
to success states j:

$$c_i = \sum_{j \,\in\, \text{success states}} r_{ij}$$

In the model of figure 62, we have $c_1 = 1$, $c_2 = r_{23}$, and $c_3 = 0$.

[Figure 62 states: no failure; one failure; two failures; stage failure (detected, undetected). $r_{23}$ = second failure coverage; $r_{35}$ = undetected failure ratio]
Figure 62.—Near-Term/Intermediate-Term Markov Stage Model
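The five-state stage model can be exercised with a short numerical sketch. This is not CARSRA: the helper names, the per-module rate `LAM`, the exit rates of 3, 2, and 1 times `LAM`, the rate ratios, and the simple fixed-step integration are all assumptions chosen for illustration, and transient faults are left at zero.

```python
from typing import Dict, List, Set, Tuple


def transition_rate(lam_p_i: float, r_ij: float,
                    lam_t_i: float = 0.0, leak_ij: float = 0.0) -> float:
    """Rate i -> j: permanent part plus transient (leakage) part."""
    return lam_p_i * r_ij + lam_t_i * leak_ij


def coverage(rate_ratios_from_i: Dict[int, float],
             success_states: Set[int]) -> float:
    """Sum of the rate ratios for transitions into success states."""
    return sum(r for j, r in rate_ratios_from_i.items() if j in success_states)


def state_probabilities(rates: Dict[Tuple[int, int], float], n_states: int,
                        t_end: float, steps: int = 10000) -> List[float]:
    """Integrate the state equations with fixed forward-Euler steps.

    States are numbered from 1; the stage starts fault free in state 1.
    """
    p = [1.0] + [0.0] * (n_states - 1)
    h = t_end / steps
    for _ in range(steps):
        dp = [0.0] * n_states
        for (i, j), lam in rates.items():
            flow = lam * p[i - 1]
            dp[i - 1] -= flow
            dp[j - 1] += flow
        p = [pk + h * dk for pk, dk in zip(p, dp)]
    return p


SUCCESS = {1, 2, 3}                      # states 4 and 5 are stage failures
LAM = 100e-6                             # hypothetical module rate, per hour
EXIT_RATE = {1: 3 * LAM, 2: 2 * LAM, 3: LAM}
RATIOS = {1: {2: 1.0},                   # every first failure is survived
          2: {3: 0.6, 4: 0.4},           # r23 = 0.6, r24 = 0.4 (illustrative)
          3: {4: 0.25, 5: 0.75}}         # r35 = 0.75 (illustrative)

rates = {(i, j): transition_rate(EXIT_RATE[i], r)
         for i, row in RATIOS.items() for j, r in row.items()}
coverages = {i: coverage(row, SUCCESS) for i, row in RATIOS.items()}
probabilities = state_probabilities(rates, n_states=5, t_end=10.0)
```

With these inputs `coverages` reproduces the pattern stated above (1 for state 1, $r_{23}$ for state 2, and 0 for state 3), and `probabilities[3]` and `probabilities[4]` are the detected and undetected stage failure probabilities after 10 hours.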
The Markov model for the far-term quadruple stages is shown in figure 63. This model
is similar to the triplex-stage model, the only difference being that the quad stage is assumed
to survive all possible combinations of two failures.
After establishing the system dependency tree and the Markov models for each stage, the
next step is to assess the Markov model parameters, $\lambda_{ij}$, which, as was seen above, may be
partitioned into permanent failure rates, $\lambda_{pi}$; rate ratios, $r_{ij}$; transient fault rates, $\lambda_{Ti}$; and
leakages, $\ell_{ij}$. For the ARCS analysis, the most significant of these are the coverage
parameters $r_{23}$ of figure 62 and $r_{34}$ of figure 63, which model the performance of the
ARCS duplex-to-simplex redundancy degradation process. The estimation of the coverage
parameters is of fundamental importance in the analysis of an ARCS-type system. Unfortunately,
this assessment is a nontrivial task, in particular for a digital system, due to the
vast number of possible failure modes. However, a promising new approach was developed
by GE, in close cooperation with Boeing, during the ARCS program. The essence of this
method is described below in section 6.1.2.2. A more comprehensive account is given in
appendix G.
A significant effort during the ARCS reliability analysis also went into estimating the
transient-related Markov model parameters. Four different sources of transient faults were
considered: electrical power system transients, lightning, hydraulic pressure transients,
and sensor nuisance failures. Since the latter failure source has been particularly troublesome
in contemporary systems, the emphasis in the analysis was directed toward the prediction
of sensor nuisance failure rates. It was shown that only a few particular sensors would
be troublesome and that the nuisance failure rate could be controlled even for these sensors
by proper design of the signal selection and failure-detection algorithms. Since the presentation
of this analysis is rather lengthy, it has been deferred to appendix H, which also contains
the rate ratio assessments performed for the various stages.
6.1.2.2 A New Coverage Assessment Method
Compared to a triple-modular redundant (TMR) system like the WWCS, the ARCS achieves
improved fault tolerance by being able to survive the vast majority of all combinations
of two failures because it is able to reconfigure from duplex to simplex operations. The
degree of success for this reconfiguration is given by the "coverage for a failure in duplex,"
which is the conditional probability of system survival given a like failure when the system
is operating at duplex redundancy. It was seen above that when the Markov model representation
is used, any coverage parameter may be expressed as a sum of rate ratios. In
the Markov model representations of figures 62 and 63 for the triplex and quadruplex
stage, this sum of rate ratios degenerates to a single rate ratio ($r_{23}$ and $r_{34}$, respectively).
These rate ratios, one for each of the various stages, influence the system failure probability
very significantly. As a matter of fact, it can be shown that the probability of losing a
stage function is proportional to a factor $(1 - r_{23})$ for a triplex stage and $(1 - r_{34})$ for a
quadruple stage. From this it is clear that the assessment of coverage parameters is of
central importance in the reliability analysis.

[Figure 63 states: no failures; one failure; two failures; three failures; stage failure (detected, undetected)]
Figure 63.—Far-Term Markov Stage Model
The ARCS duplex-to-simplex reconfiguration process has three phases: failure detection,
failure localization, and redundancy degradation. The primary means of failure detection
is comparisons between like signals. Failure localization may be accomplished by several
methods, depending on the nature of each particular stage function. Generally, a combination
of hardware and software monitors is used. Redundancy degradation to simplex is achieved
by successfully isolating the failed module without significantly disturbing the system
function.
Of the three phases described above, failure localization is the most crucial, since the ability
to localize a failure generally determines the level of coverage attained. The problem of
estimating coverage for a computer is therefore closely related to the capability of localizing
failures via software self-test and/or hardware monitoring. There is, however, a significant
difference. The capability of localizing a failure is generally given by indicating the percent
of all failures that will be detected. Coverage, on the other hand, may be expressed as a
ratio between transition rates. The relation between coverage, c, and failure localization
percentage, d, is expressed by a simple formula

$$c = d \cdot \frac{\lambda_l}{\lambda_{tot}} \cdot 10^{-2}$$

where $\lambda_l$ is the average failure rate of the localized failures and $\lambda_{tot}$ is the average failure
rate of the total failure population.
The expression shows that in order to assess coverage, it is necessary not only to determine
whether a certain failure mode is detected by self-test but also to estimate the failure rate
of each failure mode.
It was shown (see app. G) that sufficient data are presently not available on failure modes
of digital devices. An exact coverage assessment is therefore impossible. However, if the
assumption is made that $\lambda_l \geq \lambda_{tot}$, i.e., that the average failure rate of the covered
failures is greater than or equal to the failure rate of all failures, coverage could be estimated
by

$$c \geq d \cdot 10^{-2}$$

In appendix G it was shown that the assumption $\lambda_l \geq \lambda_{tot}$ is reasonable if the set of
covered failures constitutes a random sample from the set of all possible failures.
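The relation between coverage and localization percentage can be illustrated with a minimal Python sketch. The function name `coverage_from_modes` and the four failure modes with their rates are hypothetical, chosen only for illustration; they are not ARCS data. The sketch computes d as the percentage of localized failure modes and c from the formula in the text, and shows that c is at least d/100 when the localized failures have an average rate no lower than that of the whole population.

```python
from statistics import mean

def coverage_from_modes(modes):
    """Return (d, c) for a list of (failure_rate_per_hour, localized) pairs.

    d is the percentage of failure modes that are localized; c is the coverage
    c = d * (lambda_l / lambda_tot) * 1e-2 as given in the text.
    """
    all_rates = [rate for rate, _ in modes]
    localized_rates = [rate for rate, localized in modes if localized]
    if not localized_rates:
        return 0.0, 0.0
    d = 100.0 * len(localized_rates) / len(all_rates)
    lam_l = mean(localized_rates)   # average rate of the localized failures
    lam_tot = mean(all_rates)       # average rate of the whole population
    return d, d * (lam_l / lam_tot) * 1e-2

# Hypothetical failure modes, rates in failures/hour (illustrative values only).
modes = [(2e-6, True), (1e-6, True), (3e-6, True), (1e-6, False)]
d, c = coverage_from_modes(modes)
print(f"d = {d:.0f}%, c = {c:.3f}, lower bound d/100 = {d / 100:.2f}")
```

Written this way, c is simply the rate-weighted share of failures that are localized, which is why both the count of modes and their individual failure rates are needed for an exact assessment.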
With this assumption, the problem of estimating coverage reduces to assessing the portion
of all failure modes that will be detected by self-testing. In a digital system this is a formidable
task because of the high number of possible failure modes. Even if the analysis is limited
only to "stuck at" failures, the number of failure modes to be considered may often be
on the order of tens of thousands. An exhaustive analysis will therefore be impractical, if
not impossible.
To circumvent this problem, it was suggested that coverage be estimated from a sample
of randomly selected failure modes. The quality of the resulting coverage estimate will
then depend on the sample size. A confidence interval could be constructed for the estimate
with a width depending on the sample size much like in the theory of random sampling.
The idea of assigning a confidence interval to the coverage estimate is in line with the
general practice of giving component failure rate data with 60% or 90% confidence.
The approach outlined above was tested by estimating coverage for the ARCS servo stage.
This stage was selected rather than the computer stage because the servo stage could be
analyzed by a conventional manual FMEA. The results of the sampling approach could there-
fore be compared with FMEA results.
The close agreement found between the two methods supports the validity of basing coverage
assessment on a set of randomly selected failure modes using a methodology borrowed
from random sampling theory. The reader is referred to appendix G for further details.
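The sampling idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of appendix G: the Wilson score interval is one common choice of binomial confidence interval and is substituted here for whatever interval the study used, and the population of 20,000 failure modes, the `detected` rule, and the sample size of 200 are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def wilson_interval(k, n, confidence=0.90):
    """Wilson score interval for a proportion, given k successes in n trials."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def estimate_detected_fraction(failure_modes, is_detected, sample_size, rng):
    """Estimate the detected fraction from a random sample of failure modes."""
    sample = rng.sample(failure_modes, sample_size)
    k = sum(1 for mode in sample if is_detected(mode))
    return k / sample_size, wilson_interval(k, sample_size)

# Hypothetical population: 20,000 failure modes, 1 in 20 missed by self-test.
rng = random.Random(0)
modes = list(range(20000))
detected = lambda mode: mode % 20 != 0
est, (low, high) = estimate_detected_fraction(modes, detected, 200, rng)
print(f"estimate = {est:.3f}, 90% interval = ({low:.3f}, {high:.3f})")
```

The interval narrows roughly with the square root of the sample size, so the analyst can trade analysis effort against the precision of the coverage estimate.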
6.1.2.3 Study Results
This section contains the major reliability results generated by CARSRA for several system
configurations ranging from the WWCS to the far-term ARCS application. Two main
elements were involved in the application of CARSRA. The first, and most substantial,
was the establishment of the various parameters needed to evaluate each configuration,
and the second was the actual running of the program followed by the collection and
presentation of results. These results and their relation to stated ARCS requirements and
goals will be presented in this section; the rather lengthy and cumbersome considerations
that went into the definition of various model parameters are given in appendix H.
Three baseline ARCS application models were established early in the ARCS program effort:
the near-term, the intermediate-term, and the far-term applications. These models were
to be compared with a control system configured around the WWCS MCP-703 computer
system. To provide an equal basis for comparison, the WWCS was assumed to use sensors and
servos identical to those of the near-term ARCS application.
Reliability data generated during the reliability/trade study included the following:
Baseline Studies:
• WWCS and ARCS baseline near- and intermediate-term CWS mode system failure
probabilities
• WWCS and ARCS near- and intermediate-term autoland mode availability and failure
probabilities
• Probability of a diversion and system failure probability for the ARCS far-term
application
Trade Studies:
• Reliability data for four different voting node configurations
• Transition rate ratio sensitivities for three ARCS near-term voting configurations
• Latent sensor failure sensitivity for the ARCS near-term application
• ARCS cross-channel communication link study
The results and conclusions from these studies, except the cross-channel communication
study, are presented below. (Appendix I contains the cross-channel communication study.)
It is felt that the numbers predicted are conservative since the coverage parameters assumed
for the various system modules (app. H) reflect what presently is generally obtainable
without resorting to sophisticated monitoring schemes.
CWS Mode Results.—The data presented below are based on the reliability models and
parameters of appendix H. Transient failure rates, mainly sensor nuisance failures, were
assumed to be the same for the ARCS near-term application and WWCS since these systems
have identical sensors. In table 6, CWS mode failure probabilities are listed for the WWCS
system and the ARCS near- and intermediate-term systems. A failure probability of
$10^{-5}$ for a 1-hour mission was established as a goal in the ARCS requirement specification.
This number is associated with the FAA definition of an improbable event. All failure
probabilities were computed assuming a failure-free system at time zero.

Table 6.—Near- and Intermediate-term CWS Failure Probabilities
Stated ARCS goal: A system failure probability of less than 10^-5 (*) for a 1-hour flight

Mission time,    WWCS           ARCS           ARCS
hours                           near term      intermediate term
  1              2.7 x 10^-6    5.4 x 10^-7    1.7 x 10^-7
  2              1.1 x 10^-5    2.2 x 10^-6    6.9 x 10^-7
  3              2.4 x 10^-5    4.9 x 10^-6    1.5 x 10^-6
  4              4.3 x 10^-5    8.6 x 10^-6    2.8 x 10^-6
  5              6.7 x 10^-5    1.4 x 10^-5    4.3 x 10^-6
  6              9.6 x 10^-5    2.0 x 10^-5    6.2 x 10^-6
  7              1.3 x 10^-4    2.7 x 10^-5    8.4 x 10^-6
  8              1.7 x 10^-4    3.5 x 10^-5    1.1 x 10^-5
  9              2.2 x 10^-4    4.4 x 10^-5    1.4 x 10^-5
 10              2.7 x 10^-4    5.4 x 10^-5    1.7 x 10^-5

* 10^-5 = "remote" event (FAA)

From table 6 the conclusion is drawn that all three systems satisfy the stated ARCS goal
of a failure probability of less than $10^{-5}$ for a 1-hour flight.
The ARCS near-term system exhibits a fivefold failure probability improvement relative
to the WWCS. This improvement results mainly from the ability of the ARCS to survive most
combinations of two like failures. Advanced sensors in the ARCS intermediate-term system reduce the
failure probability by another factor of 3. The assumption was made that all stages
except the R/A and ILS were required for CWS operation.
Autoland Mode Results.—Autoland failure probabilities were generated for the WWCS,
the ARCS near-term, and the ARCS intermediate-term applications using the system models
presented above. For the purpose of reliability analysis, a landing was separated into two
phases: a 15-second landing phase followed by a 30-second rollout phase. These phases
were assumed to require different hardware. The 15-second landing phase was assumed to
use all modules except the DG, compass, and compass coupler in the near-term ARCS
and WWCS applications, but all modules were required for ARCS intermediate-term applica-
tion. Furthermore, it was assumed that the rollout phase requires only those modules
necessary for the directional control of the aircraft. For the WWCS and the ARCS near-
term application, the compass coupler, R/A, normal and longitudinal accelerometers,
DG, compass, and air data computers were eliminated from consideration during the
rollout phase. In the intermediate application, the R/A, normal and longitudinal accelerom-
eters, one gyro, and pitot/static source were eliminated during the rollout phase.
The resulting autoland failure probabilities are as follows:
WWCS             ARCS near term    ARCS intermediate term
18 x 10^-11      1.7 x 10^-11      0.3 x 10^-11
The numbers are based on the assumption that all modules are functional at Cat III alert
height (50 feet). According to the regulations, a hazardous event during landing must be
"extremely improbable," which has been interpreted to mean having a probability of
occurrence less than $10^{-9}$/landing. It may be shown that this figure is consistent with a
failure probability requirement of 10 /landing (sec. 4.1). The conclusion may then be drawn
that all three systems considered satisfy the requirement by a wide margin, provided all
modules are functional at minimum decision height.
The requirement that all system modules are to be functional at minimum decision height
imposes a severe constraint on the availability of the autoland function since the MTBF of
a redundant system of this complexity is quite low. The MTBF is 125 hours for the WWCS
and 137 hours for the ARCS near-term application.
Requiring that all modules be functional places them, from a reliability point of view,
in series, so that the probability of having no failures may be estimated by:

$$p(\text{no failure at } t) = e^{-\lambda t}$$

with $\lambda$ = 1/125 per hour for the WWCS and $\lambda$ = 1/137 per hour for the ARCS near-term application.
Table 7 displays the probability of no module failure as a function of time. The very
rapidly decreasing functional readiness of the Cat III autoland function is a definite
disadvantage of the WWCS system as well as of any complex redundant system that requires
all modules to be operational at the beginning of a certain critical phase of the mission.

Table 7.—Probability of No Module Failure

Time,     WWCS     ARCS
hours              near term
  20      0.85     0.86
  40      0.73     0.75
  60      0.62     0.64
  80      0.52     0.56
 100      0.45     0.48

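The entries of table 7 follow directly from the series-system expression with the two MTBF values quoted above. A short Python sketch, assuming only those MTBF figures (the names `MTBF_HOURS` and `p_no_failure` are illustrative), evaluates the expression at the tabulated times:

```python
import math

# MTBF values quoted in the text, in hours.
MTBF_HOURS = {"WWCS": 125.0, "ARCS near term": 137.0}

def p_no_failure(t_hours, mtbf_hours):
    """Probability that no module has failed by time t: exp(-t / MTBF)."""
    return math.exp(-t_hours / mtbf_hours)

for t in (20, 40, 60, 80, 100):
    row = "  ".join(f"{p_no_failure(t, m):.2f}" for m in MTBF_HOURS.values())
    print(f"{t:4d} h  {row}")
```

The computed values agree with table 7 to within about 0.01, the differences being consistent with rounding or truncation in the printed table.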
The ability of the ARCS to survive the vast majority of all combinations of two failures
suggests the possibility of permitting a Cat III landing even if a module is failed in the system.
The implication of this possibility will be demonstrated next.
It was pointed out earlier that the functional readiness (and functional availability) of a
redundant Cat III autoland system may be improved by permitting landing with degraded
redundancy. However, this has to be accomplished without violating the system failure
probability requirement of 10 /landing. ARCS near- and intermediate-term systems were
analyzed assuming the following six selected functional readiness criteria:
1. No module failed
2. No module or any one sensor failed
3. No module or any one computer failed
4. No module or any one servo failed
5. No module or any one sensor or servo failed
6. No module or any one sensor, computer, or servo failed
Of the six criteria, only the first will be acceptable for the WWCS, since this system by
definition always fails upon a second like failure.
Autoland availability and failure probability are displayed in tables 8 and 9 for the near-
and intermediate-term ARCS applications for scheduled maintenance intervals ranging
from T = 20 hours to T = 100 hours.
Since CARSRA computes functional readiness, which is a system property, rather than
availability, which depends on the maintenance strategy, the availability numbers of tables
8 and 9 had to be evaluated by hand calculations using the formulas derived below.
Letting $R_i$ be the reliability of a module in stage i, the system availability, assuming a
scheduled maintenance interval T, is with all stages surviving:

$$AV_0 = \frac{1}{T} \int_0^T \prod_i R_i^3 \, dt$$

which with $R_i = e^{-\lambda_i t}$ and $3 \sum_i \lambda_i = \lambda_s$ becomes:

$$AV_0 = \frac{1 - e^{-\lambda_s T}}{\lambda_s T}$$

The availability corresponding to having all stages operational or one module failed in
stage j is $AV = AV_0 + AV_j$, where $AV_j$ is given by:

$$AV_j = \frac{1}{T} \int_0^T 3 R_j^2 (1 - R_j) \prod_{i \neq j} R_i^3 \, dt = \frac{3 \left(1 - e^{-(\lambda_s - \lambda_j) T}\right)}{(\lambda_s - \lambda_j) T} - 3 AV_0$$

This formula is easily generalized to availability criteria corresponding to one module
failed in several stages:

$$AV = AV_0 + \sum_j \left[ \frac{3 \left(1 - e^{-(\lambda_s - \lambda_j) T}\right)}{(\lambda_s - \lambda_j) T} - 3 AV_0 \right]$$

where the summation is over stages with a failed module.
The results show that the availability of the Cat III autoland function may be improved
very significantly by permitting landings with a degraded system.
[Tables 8 and 9: autoland availability and average failure probability as functions of the maintenance interval for the near- and intermediate-term ARCS applications; the tabulated values are not legible in this copy.]
The system failure probability constraint is satisfied in the near-term system for all
considered functional readiness criteria provided the maintenance interval does not exceed
TM = 40 hours. Longer maintenance intervals may be tolerated by requiring all computers
to be operational at alert height. However, this will decrease the availability. For the
intermediate-term system, all considered functional readiness criteria will result in acceptably
low system failure probability for maintenance intervals not exceeding TM = 100 hours.
The availability improvements realizable by an ARCS-type system, compared to a TMR
system like the WWCS, could be a decisive factor in a commercial STOL application where
every STOL landing will have safety requirements similar to those of a Cat III automatic
landing.
CCV Fly-By-Wire Mode Results.—A controls configured vehicle (CCV) application, where a
failure of the control system will result in loss of the aircraft, was assumed for the ARCS
far-term system. A requirement that the system failure probability not exceed 10"° for
a 1-hour flight and 10"" for a 10-hour flight was postulated in the ARCS requirements.
The baseline far-term system employs quadruple redundancy in all stages except in stages
with redundant functions, such as the upper and lower rudder servos and the aileron and
spoiler servos, for which triple redundancy was assumed.
In the operational model for the far-term application, it was assumed that a diversion will
be made if two like module failures occur during a mission. The duration of a diversion
is assumed to be no longer than 30 minutes. The probability of a diversion and the probability
of system failure are presented in table 10.

Table 10.—ARCS Baseline Far-Term Results

Scheduled        System failure    Diversion
mission time,    probability       probability
hours            (x 10^-9)         (x 10^-5)
  1               0.13              0.15
  2               0.52              0.61
  3               0.98              1.13
  4               2.07              2.40
  5               3.22              3.73
  6               4.61              5.34
  7               6.23              7.22
  8               8.09              9.30
  9              10.20             11.80
 10              12.50             14.50

The results of table 10 are based on current (1975) failure rates. The conclusion may therefore
be drawn that a quadruple control system using ARCS technology will be adequate for a
future CCV application. However, it should be pointed out that the reliability model assumes
perfect coverage for any combination of two failures (and the vast majority of third
failures). The analysis also indicated that the far-term requirements could not be met by a
conventional fail-op/fail-op/fail-passive quadruple system.
Voting Node Study Results.—The objective of the voting node study was to compare four
different voting configurations with respect to their reliability and feasibility of implementa-
tion. The configurations considered are shown in figure 64.
The "brickwall" configuration is characterized by a single voting node located at the second-
ary actuator output (force voting) but is otherwise identical to the baseline ARCS. Each
channel uses the same sensor information exchanged via the cross-channel links but
does not vote the sensor signals. Failure of any module (sensor, computer, or servo) in
a channel will cause loss of the whole channel function. Since each computer has access
to the same data in the brickwall configuration (A) as in the baseline configuration (B),
parameters like second-failure coverages and transient leakages will be assumed identical.
This also applies to the other two configurations. The result of the voting node study
will therefore be to isolate the effect of different voting strategies.
Configuration C is the baseline ARCS with servo command voting added. The servo command
signals are crossfed to servo interface modules in each channel, which perform signal selections.
The extra hardware required to implement this feature is not insignificant if the ability
to degrade from triplex to duplex to simplex servo redundancy is to be preserved. The
assumption is made that the near- and intermediate-term systems use an analog voting
scheme, while an implementation based on microprocessor technology appears to be a more
realistic assumption for the far-term system because of the many servo functions.
Configuration D is C with the addition of cross-strapped sensor signals. In this configura-
tion, each computer will have triplicated sensor interfaces. The number of success paths
is increased in configuration D since a computer failure does not imply loss of all sensor
functions in the channel.
The result of the voting node study is presented in table 11. Several conclusions may be
drawn from this table.
1. A large improvement (factor of 5) is obtained by voting the sensor signals.
2. The additional complication of voting the servo command signals is not justified.
3. Cross-strapping the sensor signals is a good idea unless the disadvantage of the
additional interface hardware and wiring makes it unattractive. The integrated strap-
down air data system (ISADS) used for intermediate- and far-term applications appears
to be better suited to sensor cross-strapping than the conventional system used
in the near-term application.
4. Voting is more effective in a quadruplex system than in a triplex system.

Figure 64.—Voting Trade Study Configurations
A  Brickwall (ARCS without sensor voting): sensors, computers, and servos in each channel, with force voting
B  Sensor vote using cross-channel links (ARCS baseline)
C  Baseline plus servo command voting with mode switch
D  Sensor data cross-strapping and servo command voting with mode switch

[Table 11: results of the voting node study; the tabulated values are not legible in this copy.]

Sensitivity Study Results.—The sensitivity study had three objectives: (1) to assess
the sensitivity of the near-term system failure probability to variations in Markov model
transition rates for the different stages, (2) to assess the relative impact of computer
unreliability on system failure probability, and (3) to assess the significance of latent
sensor failures.
An example of a Markov model for a stage was shown in figure 62. The parameters $\lambda_{ij}$
in the figure are the transition rates. These rates and the structure of the Markov model will,
together with the dependency between stages, completely specify the system failure
probability. The sensitivity of this probability to variations in the rates $\lambda_{ij}$ could be studied,
in principle, by making repeated computer runs with slightly altered values for $\lambda_{ij}$.
However, this approach would be practical only in a limited study involving a few
specific, selected rates.
Another possibility would be to derive an analytical expression for the system failure
probability as a function of the rates $\lambda_{ij}$. This possibility was explored. An approximate
method for evaluating Markov state probabilities was developed. This method was useful
not only for generating sensitivity data, but also for checking CARSRA program outputs
for gross errors. In its simplest form, the approach finds the term of lowest order in t in
the Taylor expansion for the Markov state probabilities. This term provides a good
estimate of the actual state probability if $\lambda_{ij} t \ll 1$ for all $\lambda_{ij}$. The essence of the approach
will now be outlined.
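Consider the simple Markov model of a triplex stage shown in figure 62. The Markov
equation for state 2 is:

$$\dot{p}_2 = -(\lambda_{23} + \lambda_{24})\, p_2 + \lambda_{12}\, p_1$$

If for t = 0, $p_1 = 1$ and $p_2 = 0$, we will have $p_1 \gg p_2$ for small t. Therefore

$$\dot{p}_2 \approx \lambda_{12}\, p_1$$

which after integration becomes

$$p_2 \approx \lambda_{12}\, t$$

Similarly, for state 3:

$$\dot{p}_3 = -\lambda_{34}\, p_3 + \lambda_{23}\, p_2$$

which, if $p_3 \ll p_2$, leads to

$$p_3 \approx \lambda_{12} \lambda_{23} \frac{t^2}{2}$$

Finally, for state 4:

$$\dot{p}_4 \approx \lambda_{24}\, p_2$$

so that

$$p_4 \approx \lambda_{12} \lambda_{24} \frac{t^2}{2}$$
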
The following observations are made from these results. State 2 is reached after one
transition (from state 1 to state 2) and the approximate probability for this state is the
product of the one transition rate $\lambda_{12}$ and t to the power of one. State 3 was reached
after two transitions, and the probability was formed by multiplying the two transition
rates in the path leading from state 1 to state 3 with each other and then multiplying
by $t^2/2$. These observations are special cases of the following rule:
The probability of an arbitrary state may be approximated by summation of the
contributions from paths leading to the state starting at the source state (state 1
in the example). The contribution from each path is formed by multiplying
together the transition rates in the path and then multiplying by $t^n/n!$, where
n is the number of transitions in the path.
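The path-summation rule lends itself to a compact implementation. The Python sketch below is an illustration only: the function name `path_sum_probability` is hypothetical, the four transition rates are invented values for a triplex stage with the state numbering of figure 62, and the model is assumed to have no repair transitions, so that the state graph contains no cycles.

```python
import math

def path_sum_probability(rates, source, target, t):
    """Approximate the probability of `target` at time t by the path rule.

    rates maps (i, j) to the transition rate from state i to state j.
    Each path from source to target contributes prod(rates) * t**n / n!,
    where n is the number of transitions. Assumes an acyclic model.
    """
    successors = {}
    for (i, j), lam in rates.items():
        successors.setdefault(i, []).append((j, lam))
    total = 0.0
    stack = [(source, 1.0, 0)]          # (state, product of rates so far, n)
    while stack:
        state, product, n = stack.pop()
        if state == target and n > 0:
            total += product * t**n / math.factorial(n)
            continue                    # a path ends when it reaches the target
        for nxt, lam in successors.get(state, []):
            stack.append((nxt, product * lam, n + 1))
    return total

# Hypothetical triplex-stage rates per hour (illustrative values only).
rates = {(1, 2): 3e-4, (2, 3): 2e-4, (2, 4): 1e-5, (3, 4): 1e-4}
p2 = path_sum_probability(rates, 1, 2, 1.0)   # lambda12 * t
p3 = path_sum_probability(rates, 1, 3, 1.0)   # lambda12 * lambda23 * t^2 / 2
p4 = path_sum_probability(rates, 1, 4, 1.0)   # two-step path plus three-step path
print(p2, p3, p4)
```

For state 4 the sketch adds the three-transition path through state 3 to the two-transition term; when all products of rate and time are much smaller than 1, that extra term is negligible and the lowest-order result of the derivation is recovered.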
This simple rule makes it possible to find approximate failure probabilities for rather
complex systems by hand calculation. As an example, an approximate prediction of the
probability of a detected failure in the ARCS near-term system can be made as follows.
The various stages in the ARCS near-term application are listed in table 12 together with
a stage number assignment that will be used for this example. The dependency structure
of the ARCS near-term system was defined in figure 59. The probability of a detected
system failure may be estimated by considering all combinations of two module failures
leading to system failure, i.e., all two-step transitions in a comprehensive Markov model
that end up in the detected failure state. It is recognized that an error will be made by
ignoring combinations of three or more failures equivalent to system failure.
This error will be small, however, if $\lambda t \ll 1$ for all transition rates.
All two-step transitions leading to system failure are listed in table 13. The (stage) location
of the first transition (which is from the first to the second state in the stage Markov model
of fig. 62) is indicated in the first column. Stages for which a second transition (between
states 2 and 4 of the stage Markov model) will result in system failure are indicated in the
second column. Note that stages 6 and 7 are not required for the CWS mode. The system
failure probability may be estimated by summing over all two-transition contributions,
$\lambda_{12}^{(i)} \lambda_{24}^{(j)} t^2/2$, where i and j pertain to stage numbers in columns 1 and 2, respectively:

$$FP(t) \approx \sum_{i} \sum_{j} \lambda_{12}^{(i)} \lambda_{24}^{(j)} \frac{t^2}{2}$$

Evaluating this expression using the failure rates defined in figure 59 yields

$$FP(t) = 0.547 \cdot 10^{-6} \cdot t^2$$

where t is the exposure time in hours. Good agreement is shown with the second column
of table 6.
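The agreement can be checked with a few lines of Python. The sketch below assumes only the coefficient of the hand-calculated expression and the ARCS near-term column of table 6 as printed; the names `fp_approx` and `TABLE6_ARCS_NEAR_TERM` are illustrative.

```python
# ARCS near-term CWS failure probabilities from table 6, mission times 1 to 10 hours.
TABLE6_ARCS_NEAR_TERM = [5.4e-7, 2.2e-6, 4.9e-6, 8.6e-6, 1.4e-5,
                         2.0e-5, 2.7e-5, 3.5e-5, 4.4e-5, 5.4e-5]

def fp_approx(t_hours):
    """Hand-calculated approximation FP(t) = 0.547e-6 * t^2."""
    return 0.547e-6 * t_hours ** 2

for t, carsra in enumerate(TABLE6_ARCS_NEAR_TERM, start=1):
    approx = fp_approx(t)
    diff = 100.0 * (approx - carsra) / carsra
    print(f"{t:2d} h  approx = {approx:.2e}  table 6 = {carsra:.1e}  diff = {diff:+.1f}%")
```

The two sets of numbers differ by a few percent at most, which is within the two-digit precision of the tabulated values.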

Table 12.—Definition of Stage Numbers for the ARCS Near-Term Application

Stage                           Stage no.
Processor and memory             1
Multiplex and A/D                2
Watchdog and multiplex D/A       3
Compass coupler                  4
R/A                              5
ILS                              6
Yaw rate                         7
Lateral accelerometer            8
Normal accelerometer             9
Longitudinal accelerometer      10
DG                              11
Compass                         12
VG                              13
Control force sensor            14
Air data computer               15
Roll servo                      16
Pitch servo                     17
Yaw servo                       18
Hydraulic supply                19

Table 13.—Two-Step Failure Transitions Leading to System Failure for the ARCS Near-Term CWS Mode Application

First failure             Second failure transition
transition                leading to system failure
($\lambda_{12}$)          ($\lambda_{24}$)
in stage                  in stages
  1                       1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
  2                       2, 4, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18
  3                       3, 16, 17, 18
  4                       4, 11, 12
  5                       5
  6                       6
  7                       7
  8                       8
  9                       9
 10                       10
 11                       11
 12                       12
 13                       13
 14                       14
 15                       15
 16                       16
 17                       17
 18                       18
 19                       16, 17, 18, 19

The method outlined was used to establish an approximate analytical expression for
the probability of a detected failure for the ARCS near-term system using all stages. It
was seen above that this expression is a second-order polynomial in t with terms of the
type $\lambda_{12}^{(k)} \cdot \lambda_{24}^{(j)} \cdot t^2/2$, where k and j refer to separate stages.
The desired sensitivity factors could then be derived by differentiations. Two types of
sensitivity factors were used:
1. Transition rate sensitivity, $S_{ij}^{(k)}$, defined by

$$S_{ij}^{(k)} = \frac{\partial (FP)}{\partial \lambda_{ij}^{(k)}} \cdot \frac{100}{FP}$$

The factor $S_{ij}^{(k)}$ expresses the percent change in the failure probability caused by
a change of $\Delta\lambda_{ij}^{(k)}$ = 1/hour, where $\lambda_{ij}^{(k)}$ is the transition rate from state i to state j
in stage k. Thus we have

$$\Delta FP\% = S_{ij}^{(k)} \cdot \Delta\lambda_{ij}^{(k)} \ \%$$

2. Rate ratio sensitivity, $R_{ij}^{(k)}$, defined by

$$R_{ij}^{(k)} = \frac{\partial (FP)}{\partial r_{ij}^{(k)}} \cdot \frac{100}{FP}$$

A variation, $\Delta r_{ij}^{(k)}$, will cause a failure probability change of $\Delta FP$ %:

$$\Delta FP\% = R_{ij}^{(k)} \cdot \Delta r_{ij}^{(k)} \ \%$$

The rate ratio sensitivity factor is mainly of interest when assessing the sensitivity
to second-failure coverage $c = r_{23} = 1 - r_{24}$ in the various stages.
The results of the sensitivity analysis are summarized in table 14, where the factors
$S_{12}$, $S_{24}$, and $R_{24}$ are presented for the various stages and the voting configurations B,
C, and D of figure 64.
Several interesting observations may be made regarding table 14. Looking at the sensitivities
$S_{12}$, for example, we find that the baseline ARCS is very sensitive to computer transients
that cause a redundancy degradation from triplex to duplex. Assuming, for example,
that the computer would be unable to recover from 1% of the abnormal electrical power
transients occurring with a rate of $10^{-2}$/hour, the resulting increase in failure probability
becomes:

$$\Delta FP\% = 5.4 \times 10^{4} \cdot 3 \times 10^{-2} \cdot 10^{-2} = 16\%$$

With sensor and servo cross-strapping, on the other hand, the sensitivity to leaky computer
transients is much smaller:

$$\Delta FP\% = 0.78 \times 10^{4} \cdot 3 \times 10^{-2} \cdot 10^{-2} = 2\%$$

The reason for this difference is of course that a loss of a computer in the baseline ARCS
causes degradation to duplex in all stages.
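The two percentages can be reproduced with a small Python sketch of the sensitivity relation. The sensitivity values and the transient figures are taken from the text; reading the factor of 3 as the three computers of the triplex stage is an interpretation made here, not a statement from the report, and the function name is illustrative.

```python
def delta_fp_percent(sensitivity, delta_rate_per_hour):
    """Percent change in failure probability: sensitivity times rate change."""
    return sensitivity * delta_rate_per_hour

# Assumed reading of the factors: 3 computers x 1e-2 transients/hour x 1% not recovered.
delta_lambda_12 = 3 * 1e-2 * 1e-2

baseline = delta_fp_percent(5.4e4, delta_lambda_12)         # ARCS baseline
cross_strapped = delta_fp_percent(0.78e4, delta_lambda_12)  # with cross-strapping
print(f"baseline: {baseline:.1f}%  cross-strapped: {cross_strapped:.1f}%")
```

The unrounded results are about 16.2% and 2.3%, which the text quotes as 16% and 2%.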
For the baseline ARCS, second-failure coverage sensitivities ($R_{24}$) are most significant
for the R/A, ILS, DG, and servo stages, whereas a high computer coverage is most important
in the configuration with cross-strapped sensor and servo signals.
It is of interest to find that proportion of system failure probability attributable to computer
unreliability. This can be done by setting the transition rates for the computer stages to
zero in the CARSRA program. The result is presented in table 15.
The dramatic decrease when introducing sensor cross-strapping may be explained by the
fact that the bulk of system unreliability is associated with uncovered sensor failures.
The impact of a computer failure will therefore be very significant if it has the effect of
reducing all sensor redundancy from triplex to duplex.
[Table 14: sensitivity factors $S_{12}$, $S_{24}$, and $R_{24}$ for the various stages and voting configurations B, C, and D; the tabulated values are not legible in this copy.]

Table 15.—Contribution of Computer Unreliability to System Failure Probability

ARCS near-term configuration                     Contribution, percent
ARCS baseline                                    57
With servo voting                                55
With sensor cross-strapping and servo voting     12

Latent Sensor Failure Study Results.—The objective of this study was to evaluate to what
extent latent sensor failures will impact the reliability results, while at the same time
demonstrating the capability of CARSRA to handle a more complicated stage model.
Latent sensor failures are mainly passive failures of signals that nominally are close to zero.
These types of failures may go undetected for some time, particularly during quiescent
flight conditions, before they are detected. Figure 65 illustrates latent sensor failures.
The seven-state Markov model depicted in figure 66 was used to describe latent failure
effects. States 1, 3, 5, and 6 model detected failure states while states 2, 4, and 7 model
latent (undetected) failures. State 2 corresponds to one latent failure, state 4 to one latent
and one permanent failure, and state 7 to an undetected stage failure. The terms $\lambda_{12}$,
$\lambda_{27}$, $\lambda_{34}$, and $\lambda_{57}$ are latent failure transition rates; $\lambda_{47}$ consists of both latent and detected
failures; the remaining rates except $\lambda_{23}$ and $\lambda_{45}$ are associated with detected failures; and
$\lambda_{23}$ and $\lambda_{45}$ model the latency detection rate, which describes the process of detecting a latent
failure by experiencing a deviation between a good signal and the failed one large enough to
trigger the threshold detector.
To make a just comparison between the models with and without latent failures, the transi-
tion rates between the different failure levels were made identical. This implies that the two
models will always yield the same total probability of a system failure; only the proportion
between detected and undetected failures will differ.
Latent failure rates for the sensors were assessed based on sensor failure mode data. (The
data sources are discussed in app. H.) Latency detection rates were assessed based on flight
records of sensor characteristics during quiescent conditions. A latency detection rate of
0.1/hour was assessed for the majority of the sensors.
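To give a feel for what a latency detection rate of 0.1/hour implies, the following minimal sketch assumes that detection is a constant-rate (exponential) process, which is the usual reading of a transition rate in a Markov model. The function name is hypothetical and is not part of CARSRA.

```python
import math

LATENCY_DETECTION_RATE = 0.1  # per hour, the value quoted for most sensors


def prob_still_latent(hours, rate=LATENCY_DETECTION_RATE):
    """Probability that a latent failure is still undetected after `hours`,
    assuming a constant detection rate (exponential waiting time)."""
    return math.exp(-rate * hours)


# Mean time to detect a latent failure under the same assumption.
mean_hours_to_detection = 1.0 / LATENCY_DETECTION_RATE

for t in (1.0, 1.3, 10.0):
    print(f"{t:5.1f} h: P(still latent) = {prob_still_latent(t):.3f}")
```

Under this assumption a latent failure is more likely than not to survive a single average-length flight undetected, which is consistent with the text's point that latency raises the undetected-failure estimate.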
Results of the sensor latency study indicate that the estimated probability of an undetected
failure increases substantially when taking into account the possibility of latent failures.
The estimated probability of an undetected failure for the near-term CWS application, for
example, was increased from 1.3 x 10^-10 to 1.2 x 10^[exponent illegible] for a 1-hour flight. However, this
probability is still small compared to the probability of a detected failure (5.4 x 10^[exponent illegible]).
[Figure 65, illustrating latent sensor failures, is not legible in this copy.]
[State diagram with two columns. DETECTED: no failure, one failure, two failures, stage failure. UNDETECTED: one latent failure, one latent and one permanent failure.]
Figure 66.—Latent Sensor Failure Model
6.1.2.4 Reliability Analysis Conclusions
The two main aspects of the ARCS reliability task were to identify a reliability estimation
method applicable to the analysis of reconfigurable fault-tolerant systems and to use this
tool to estimate the reliabilities of the WWCS and several ARCS candidate systems.
In surveying available reliability estimation tools, it was found that a new approach was
needed. Two new reliability estimation programs were developed: RESRA (Redundant System
Reliability Analysis) and CARSRA (Computer Aided Redundant System Reliability
Analysis). RESRA, which is a forerunner to CARSRA, does not have the same capability
as CARSRA but is simpler to use; i.e., it requires less input data and has a shorter processing
time.
CARSRA is designed to handle modular-redundant systems that have internal signal
consolidation points. It takes into account failure coverage and is able to model the
effect of transient faults. It also models a wide variety of redundancy degradation strategies,
including spares-augmented systems, and computes functional readiness as well as
system failure probability. CARSRA was used extensively to generate the ARCS reliability
study results and proved to be a powerful and flexible tool particularly suited for
configuration trade studies. Its unique capability to compute functional readiness and
failure probability greatly facilitated the generation of the data base for the reliability
analysis. Appendix F contains a further description of the principle of operation and a
user's guide.
The following conclusions were drawn relative to the WWCS and ARCS baseline configurations:
• The WWCS, ARCS near-term, and ARCS intermediate-term systems will all provide
adequate control wheel steering mode reliability. However, the ARCS near-term
application will exhibit 5 times lower, and the intermediate-term system 15 times
lower, failure probability than the WWCS.
• The WWCS, near-term ARCS, and intermediate-term ARCS will also have acceptably
low failure probabilities in the Cat III autoland operational mode provided all
modules work at alert height.
• The functional readiness of the WWCS Cat III autoland function is 85% after 20 hours
operation and 45% after 100 hours operation without maintenance. This poor
performance results from the requirement that all modules in the system have to be
operational at alert height.
• For the near-term ARCS, the functional readiness may be improved by permitting
Cat III landings with one module failed in the system. The resulting functional
readiness for the near-term ARCS Cat III autoland function is 97% after 20 hours operation
and 89% after 100 hours operation without maintenance. Increased functional
readiness is the most important advantage of the ARCS relative to the WWCS.
• The far-term ARCS is a quadruple system designed for CCV operation. Reliability
analysis indicates that a quadruple ARCS would satisfy the postulated failure probability
requirements.
The ARCS trade study results are summarized as follows.
• Results of the voting node trade study indicate that a large reliability improvement
is realized by voting the sensor signals but that servo command voting will
result in only a small improvement. Also, a system in which each computer directly interfaces
with all system sensors via triplicated channel interfaces will exhibit lower failure
probability than a system in which sensor data is exchanged via the processor cross-channel
links. This added reliability, however, has to be weighed against the additional
hardware, which increases the packaging size and complexity and decreases the system
MTBF.
• Results of the sensitivity study show that the computer reliability is a dominant
factor in configurations where sensor data is transferred via the processor cross
channel but is less important if the sensor signals are cross-strapped so that each computer
interfaces with all system sensors. The study also shows that the sensors dominate
the system reliability, partly because of a high cumulative failure rate (68% of the total
system failure rate in the near-term ARCS) and partly because of the conservatively
assumed low second-failure coverages.
The main conclusion of the reliability study was that the ARCS capability of redundancy
degradation from triplex to duplex to simplex opens up the possibility of initiating a
critical flight segment, e.g., a Cat III landing or a STOL takeoff or landing, with a partially
degraded system. System availability will thus be dramatically enhanced with the ARCS
compared to a TMR system like the WWCS. The observation can then be made that
redundancy in the ARCS can be used not only for the purpose of decreasing system failure
probability but also as a means of increasing functional availability where degradation to
simplex is an acceptable operation.
The observation may also be made for the autoland function that the increased survivability
obtained by the ARCS relative to the WWCS does not by itself justify the ARCS
since the reliability data presented shows that the WWCS is adequate from a safety
point of view. The advantage of ARCS is therefore in the area of improved economy.
6.2 COST/BENEFIT ANALYSIS
An assessment of the overall cost effectiveness of the ARCS was carried out in two phases:
an analysis of airline cost-of-ownership for an ARCS maintained in a Category III operational
status and an analysis of the cost effect of providing an integrated system test function
to augment system maintainability. In general, an avionic system's cost-of-ownership
comprises three major factors as depicted in figure 67: acquisition cost, maintenance
cost, and cost of schedule interruptions caused by the loss of a particular operational
capability, e.g., loss of Cat II/Cat III approach functions in conjunction with poor weather
conditions.
[Figure 67, depicting the cost-of-ownership factors, is not legible in this copy.]
The cost data for the first analysis was compiled by comparing the ARCS to the WWCS,
which represents a contemporary triple-modular-redundant (fail-op/fail-passive) system
design. The baseline for the second study was the ARCS hardware without the maintenance
enhancement elements, maintained in a conventional manner.
The data base was supplied by General Electric and United Airlines. GE provided data
relative to the ARCS and WWCS acquisition costs, and United provided cost data relative to
an airline's operation as influenced by weather conditions and line and shop maintenance
procedures.
An operations model based on the United Airlines 727 operations was selected as a repre-
sentative framework for assessing cost trends associated with the ARCS technology. The 727
has a route structure that is representative of an average airline having medium trip lengths
and is common to many operators. It serves a wide variety of airports throughout the United
States including stations with maintenance capability ranging from complete to none. The
characteristics of the UA 727 fleet operation of interest for cost-of-ownership computations
are summarized below.
Fleet size 150 aircraft
Annual flying time 350 000 hours
Average flight duration 1.3 hours
Airports served 77
Maintenance stations 21
The following discussion is divided into four sections: an analysis of the cost effectiveness
of ARCS attributable to its enhanced functional reliability/availability, an evaluation of
ARCS acquisition cost, an analysis of the cost effect of providing an integrated system test
function, and a summary.
6.2.1 ARCS AVAILABILITY COST EFFECT
The baseline airplane system on which the cost comparisons were made is the ARCS intermediate-term
application model. This model reflects a basic command augmentation system,
or CWS, and a flight-mandated (required by the operator for dispatch) Cat III autoland
capability.
Because the WWCS is a triple-modular-redundant system, any failure within the three
channels (be it sensor, computer, or servo) degrades the system below Cat II/III requirements,
and it must be repaired prior to a dispatch meeting the operator's revenue operating
policy. The ARCS, on the other hand, can be dispatched meeting Cat II/III requirements
with a fault, thus deferring system maintenance. The significance of the above baseline
considerations was not completely quantified in the study but is qualitatively brought
out, where appropriate, in the cost-related discussions that follow.
6.2.1.1 Maintenance Cost
In evaluating the maintenance cost of the ARCS relative to the WWCS, no consideration was
given to the effects of system test. System test effects were dealt with separately and are
assumed to be realizable with either the ARCS or WWCS. Therefore, the primary attribute
of the ARCS that makes it more attractive from the standpoint of system maintenance is
its ability to continue to operate in a simplex mode. With this capability, the occurrence
of any single failure in any stage will not significantly hamper the fault-tolerant characteristics
of the system, and hence immediate repair is not required.
This characteristic has several maintenance implications. The number of line stations with
system maintenance capability can be reduced since the airplane can continue operations
with an existing failure until reaching such a facility, e.g., an overnight station. Fewer maintenance
stations also imply correspondingly fewer line spares required on a fleet-wide
basis. And, since repair of first-failure conditions need not necessarily be made before the
next revenue departure, the probability of incurring a maintenance-caused departure delay
or cancellation is greatly reduced.
Although viewed as having a potentially significant impact on system cost-of-ownership, the
identified maintenance cost improvement attributable to ARCS could not be credibly
quantified within the scope of this study. Evaluating the cost effects of modified maintenance
procedures, a restructured maintenance facilities network, and projected occurrences
and durations of delays would require a highly sophisticated operational simulation of the
selected airline model. Since such a simulation was beyond the scope of the ARCS program,
these cost-effect factors have been identified and discussed strictly subjectively.
6.2.1.2 Cost of Weather Interruptions
In assessing the number of weather-caused diversions potentially avoidable by using the
ARCS, consideration was given to three parameters: the number of scheduled arrivals into
airports with Category II or III ground facilities, the probability of encountering Category II
or lower weather conditions at these facilities, and the probability that the flight control
system is functionally capable of performing the autoland task at the time it is required.
This relationship can be expressed as follows:
$$\text{Avoidable diversions} = \sum (\text{no. of flights to Cat II runway}) \cdot P(\text{Cat II}) \cdot A(\text{Cat II}) + \sum (\text{no. of flights to Cat III runway}) \cdot P(\text{Cat III}) \cdot A(\text{Cat III})$$
where A = availability.
Table 16 summarizes the airports which, according to FAA planning, are scheduled to include
Cat II and/or Cat III ground facilities by 1980. Weather probabilities were extracted from
United Airlines weather department records for the last 14 years. This information shows
that, on a probabilistic basis, the fleet of aircraft defined by the airline operational model
will incur Category II weather conditions on 957 arrivals and Category III conditions on
1022 arrivals annually.
Using the defined airline operational model (providing for maintenance capability at 21 of
its 77 destinations) and an average flight duration of 1.3 hours, and considering hardware
and functional reliabilities, the average availability is 0.999 for the ARCS and 0.964 for the
WWCS. Based on these availability parameters, the ARCS-equipped airplanes can be expected
to incur approximately 33 fewer Category II and 36 fewer Category III weather-caused
Table 16.—Weather-Caused Interruptions
Category II Airports

Airport   P(Cat II)   Flights/day   Diversions/year
ATL       0.012       15             65.7
BAL       0.005        9             16.43
BDL       0.006        2              4.38
BHM       0.001        4              1.46
BOS       0.006        4              8.76
BUF       0.003        7              7.67
CLE       0.004       35             51.10
CLT       0.008        3              8.76
DAY       0.004        3              4.38
DCA       0.003       18             19.71
DEN       0.002       41             29.93
DTW       0.002        6              4.38
EWR       0.003        4              4.38
GEG       0.008        6             17.52
HSV       0.001        6              2.19
IAD       0.008        5             14.60
JFK       0.003        3              3.29
LAX       0.005       43             78.48
LGA       0.007       12             30.66
MCI       0.002        7              5.11
MEM       0.002        4              2.92
MKE       0.007        5             12.78
MSP       0.002       12              8.76
MSY       0.005        2              3.65
OAK       0.002        6              4.38
ONT       -            4              0
ORD       0.008       97            283.24
PDX       0.005       23             41.98
PHL       0.005       11             20.08
PIT       0.004       22             32.12
PVD       0.008        2              5.84
RIC       0.005        2              3.65
ROC       0.003       10             10.95
SEA       0.005       22             40.15
SFO       0.003       51             55.85
SLC       0.001       11              4.02
SMF       0.007        6             15.33
TOL       0.004        5              7.30
TPA       0.004        2              2.92
TVS       0.010        6             21.9
Total                               956.71

Category III Airports

Airport   P(Cat III)  Flights/day   Diversions/year
ATL       0.009       15             49.28
DTW       0.004        6              8.76
IAD       0.012        5             21.90
JFK       0.011        3             12.05
LAX       0.019       43            298.21
MCI       0.002        7              5.11
ORD       0.005       97            177.03
PDX       0.018       23            151.11
SEA       0.021       22            168.63
SFO       0.007       51            130.31
Total                              1022.39
diversions than the WWCS-equipped airplanes. At $2100 per diversion (an average amount
based on United's fleet experience), this amounts to an annual savings of $145 000.
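The arithmetic behind Table 16 and the savings estimate can be sketched as follows. The per-airport entries appear to be P(weather) x flights/day x 365, and the "fewer diversions" figures appear to be the annual weather arrivals multiplied by the difference in availability. Both readings are inferred from the tabulated numbers rather than stated by the report, and all function and variable names are illustrative.

```python
DAYS_PER_YEAR = 365


def annual_weather_arrivals(airports):
    """Sum P(weather) x flights/day x 365 over (probability, flights_per_day) pairs."""
    return sum(p * flights * DAYS_PER_YEAR for p, flights in airports)


# Three sample Category II rows from Table 16: ATL, ORD, LAX.
sample_cat2 = [(0.012, 15), (0.008, 97), (0.005, 43)]
print(round(annual_weather_arrivals(sample_cat2), 2))

# Fleet totals and availabilities quoted in the text.
CAT2_ARRIVALS = 957
CAT3_ARRIVALS = 1022
A_ARCS, A_WWCS = 0.999, 0.964
COST_PER_DIVERSION = 2100  # dollars


def diversions_saved(arrivals, a_new=A_ARCS, a_old=A_WWCS):
    """Diversions avoided by the higher-availability system (assumed reading)."""
    return arrivals * (a_new - a_old)


saved_cat2 = diversions_saved(CAT2_ARRIVALS)
saved_cat3 = diversions_saved(CAT3_ARRIVALS)
annual_savings = (saved_cat2 + saved_cat3) * COST_PER_DIVERSION
print(round(saved_cat2), round(saved_cat3), round(annual_savings))
```

The unrounded result lands within a few hundred dollars of the $145 000 quoted; the small difference comes from rounding the diversion counts to 33 and 36 before multiplying.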
6.2.2 ACQUISITION COST
The first cost advantage of the ARCS, compared to the WWCS, is in the acquisition cost.
A pricing exercise, based on a production run of 990 units (300 ship sets plus 10% spares) for
both the ARCS and an advanced WWCS (i.e., the WWCS architecture with current piece-part
technology), yielded the following unit costs:
ARCS computer        $30 500/unit                        = $ 91 500/ship set
WWCS
• Computer unit      $21 000/unit
• Interface unit     $21 000/unit
                     $42 000/system                      = $126 000/ship set
The cost difference between the WWCS and ARCS is attributed to two factors: (1) the
additional hardware required to implement triple-modular-redundant system voters, monitors,
and other triplex-oriented special features and (2) the two-box nature of the WWCS. The
cost contribution of these two factors is estimated to be approximately equal.
The specified cost is for hardware only. Estimated software development costs are $1 million
for the ARCS and $850 000 for the WWCS. Amortized over 990 units, this results in an
incremental unit cost of approximately $1000 and $850, respectively.
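The unit, ship-set, and amortized software figures above can be checked with a few lines of arithmetic. This is only a sketch using the quoted prices; the three-units-per-ship-set value is implied by the 990-unit production run and by the ship-set prices, and the variable names are illustrative.

```python
SHIP_SETS = 300
UNITS_PER_SHIP_SET = 3      # implied by 990 units = 300 ship sets + 10% spares
SPARES_FRACTION = 0.10

production_units = round(SHIP_SETS * UNITS_PER_SHIP_SET * (1 + SPARES_FRACTION))

ARCS_UNIT = 30_500          # dollars per ARCS computer
WWCS_COMPUTER = 21_000      # dollars per WWCS computer unit
WWCS_INTERFACE = 21_000     # dollars per WWCS interface unit

arcs_ship_set = ARCS_UNIT * UNITS_PER_SHIP_SET
wwcs_ship_set = (WWCS_COMPUTER + WWCS_INTERFACE) * UNITS_PER_SHIP_SET

# Software development cost spread over the production run.
arcs_sw_per_unit = 1_000_000 / production_units
wwcs_sw_per_unit = 850_000 / production_units

print(production_units, arcs_ship_set, wwcs_ship_set)
print(round(arcs_sw_per_unit), round(wwcs_sw_per_unit))
```

The amortized software costs come out slightly above the rounded $1000 and $850 in the text, which is consistent with the report's "approximately".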
6.2.3 SYSTEM TEST COST EFFECT
This section discusses the analysis of economic benefits attributable to an integrated system
test capability. Though much operating cost data is currently available for contemporary
analog flight control systems, comparisons between such systems and the digital configuration
represented by the ARCS are not considered to be valid. The purpose of the analysis
therefore was to determine the cost effectiveness of the implemented system test function on its
own merits. Maintenance of the ARCS hardware with conventional methods was compared
to maintenance of the same ARCS hardware using the built-in system test capability. The
intermediate-term application model with a flight-mandated (required by operator for
dispatch) Category III autoland capability was assumed for this analysis.
The costs associated with the daily operation of an avionic system are either directly or
indirectly incurred as a result of the maintenance operation and are therefore functions of
the system configuration and the airline maintenance process. They are comprised primarily
of line maintenance costs, shop maintenance costs, sparing expenses, and delay and cancellation
costs attributable to the particular system. Line maintenance expenses are primarily
man-hour labor costs and the associated overhead. There is, however, the potential for a
delay cost if the maintenance activity takes excessive time.
Shop maintenance costs accumulate from several sources: the direct man-hour labor
associated with testing and repairing equipment that has been removed from the aircraft;
the material expense for repairing faulted units; overhead expense incurred as a function of
direct labor; and automated test equipment (ATE) operating costs. System test is not
expected to influence either material or overhead expenses. It will, however, have a beneficial
effect on labor and ATE test time.
Sparing expenses accumulate from two sources: the cost of acquiring required spare units
and the cost of holding the spares in inventory. The acquisition cost of spares is considered a
variable rather than fixed expense because the level of sparing is a function of the demand
rate and the demand rate is in turn a function of the maintenance operation. Inventory
cost is generally computed as a percentage of the acquisition cost of the units being spared.
Therefore, it is also a function of the maintenance operation.
Where it is necessary to deal with specific sensor systems, those assumed for the intermediate-term
application model were used. However, when such parameters as sensor removal rates
need to be defined, experience with the latest generation of avionic equipment was used as a
basis for intermediate-term model projections. The latest generation is that represented by
current DC-10 and 747 fleet operations.
The direct hourly labor rate assumed for all calculations was $8 per hour. Overhead rate for
line and shop maintenance operations was computed as 185% of direct labor.
The failure rate for the ARCS computer unit was estimated to be 320 failures per million
hours, equivalent to an MTBF of 3125 hours. These estimates were derived from piece-part
reliability projections using component failure rates in accordance with MIL-HDBK-217B.
Removal rates for ARCS computer units were assumed to be 1.05 times the failure rate with
system test capability and 2.0 times the failure rate without. The former assumes achievement
of the system test design goal of no more than 5% nuisance failure indications, and the
latter is indicative of current verification rate experience with contemporary flight control
systems, which ranges from 25% to 60%. Without built-in self-test capability, it is unlikely
that a digital implementation of the flight control function will significantly alter the
current experience. A comparison of verification rates for analog and digital air data
computers supports this: the analog computer used on the 747 and the digital computer
used on the DC-10 exhibit verification rates of 32% and 30%, respectively.
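The relationship between failure rate, MTBF, and the two removal-rate multipliers can be sketched as below. Treating the verification rate as the fraction of removals that correspond to a real failure (failures divided by removals) is an interpretation that matches the quoted figures; the function name is illustrative.

```python
FAILURES_PER_MILLION_HOURS = 320

# MTBF is the reciprocal of the failure rate.
mtbf_hours = 1_000_000 / FAILURES_PER_MILLION_HOURS


def verification_rate(removals_per_failure):
    """Fraction of removals confirmed as real failures (assumed definition)."""
    return 1.0 / removals_per_failure


with_system_test = verification_rate(1.05)     # about 95% of removals verified
without_system_test = verification_rate(2.0)   # 50%, inside the quoted 25% to 60% range

print(mtbf_hours, round(with_system_test, 3), without_system_test)
```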
The following analyses discuss the five cost categories of line maintenance, shop maintenance,
spares provisioning, sensor system effects, and system test acquisition cost in greater detail.
6.2.3.1 Line Maintenance Cost
Line maintenance action is initiated by the flightcrew generating a flight-log writeup of the
malfunctioning system. The line maintenance operation must then locate the failure within
the system and effect the necessary repairs. A typical line maintenance operation has been
defined and illustrated in figure 68. It consists of isolating the failure to a specific LRU,
removing the failed unit, replacing it with a spare from stock, and functionally checking the
newly installed unit. If the repair affects Cat II/III equipment, a test is made to verify the
basic functional integrity of the system.
[Flowchart: flight squawk; fault localization (gather test equipment and perform manual test procedure when needed); obtain replacement LRU; remove and replace; functionally check unit; if Cat II or III is affected, perform system verification test procedure; verified system.]
Figure 68.—Line Maintenance Operation
The only cost factors directly attributable to line maintenance are the incurred man-hours
and associated overhead. Therefore, to evaluate the cost impact of this operation, it was
necessary to make some time estimations for each of the activities involved. The time
estimations for the ARCS with and without the system test capability are summarized in
tables 17 and 18, respectively.
Table 17.—Line Maintenance Activity With System Test Capability

Activity                                      Time estimate, minutes
1. Deduce LRU location of fault               [time estimates not legible in this copy]
   • Read back in-flight fault recording
   • Perform system BITE test
2. Remove and replace faulted LRU (a)
3. Functionally test new unit
4. Cat II or III verification test
5. Clean up
Total

(a) Excludes time required to obtain replacement unit from stock, which can require
    from 30 minutes to 2 hours.
Table 18.—Line Maintenance Activity Without System Test Capability

Activity                                              Time estimate, minutes
1. Deduce LRU location of fault
   • Gather required test equipment                    15
   • Perform maintenance manual test procedure, or     60 (a)
   • Assume from log entry
2. Remove and replace faulted LRU (b)                   5
3. Functionally test new unit                           5
4. Cat II or III verification test                     60 (a)
5. Clean up                                             5
Total                                                 150 man-minutes (2.5 man-hours)

(a) Requires two maintenance technicians at 30 minutes each.
(b) Excludes time required to obtain replacement unit from stock, which can require
    from 30 minutes to 2 hours.
One additional assumption was made with regard to the localization of a fault to a particular
LRU. Half of the time the localization will be directly deduced from the nature of the
squawk or, at most, only minimal testing will be performed. This assumption was based on
past experience, which shows that 50% of the incurred failures were adequately handled by
"gate level" maintenance and that the other 50% required further testing. With this
assumption, the line maintenance man-hour requirement without system test capability
is 2.50 hours when localization testing is performed and 1.25 hours when the fault
location is assumed, for an average of 1.88 hours per removal.
Computation of line maintenance cost becomes a function of the number of removals per
year, the man-hours required, and the burdened hourly labor rate:
Unit flying hours = (3 units/airplane) (350 000 airplane hours/year) = 1.05 x 10^6 hours/year

                     Unit flying    Failures/     Removals/   Removals/   Man-hours/   Cost/   Overhead   Total
                     hours/year  x  10^6 hours x  failure   = year      x removal    x hour  x factor   = cost
With system test     1.05 x 10^6    320           1.05        352.8       0.41         $8      2.85       $3297.97, or $3300
Without system test  1.05 x 10^6    320           2           672         1.88         $8      2.85       $28 804.61, or $28 800
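The line maintenance computation can be reproduced with a short function. This is a sketch of the arithmetic only, using the values quoted in this section; the function and constant names are illustrative.

```python
UNITS_PER_AIRPLANE = 3
FLEET_HOURS_PER_YEAR = 350_000
FAILURES_PER_MILLION_HOURS = 320
LABOR_RATE = 8.00        # dollars per direct man-hour
OVERHEAD_FACTOR = 2.85   # direct labor plus 185% overhead


def line_maintenance_cost(removals_per_failure, man_hours_per_removal):
    """Annual line maintenance cost in dollars for the computer units."""
    unit_hours = UNITS_PER_AIRPLANE * FLEET_HOURS_PER_YEAR
    removals = unit_hours * FAILURES_PER_MILLION_HOURS / 1e6 * removals_per_failure
    return removals * man_hours_per_removal * LABOR_RATE * OVERHEAD_FACTOR


with_test = line_maintenance_cost(1.05, 0.41)
without_test = line_maintenance_cost(2.0, 1.88)
print(round(with_test, 2), round(without_test, 2))
```

The 1.88 man-hours used for the case without system test is the average of the 2.50-hour and 1.25-hour cases, rounded as in the text.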
6.2.3.2 Shop Maintenance Costs
Once a unit is removed from the airplane, it becomes the responsibility of the shop mainten-
ance operation to make the necessary repairs to restore the unit to an operable condition and
return it for use. The shop maintenance operation can be viewed in three steps: first, the
unit must be tested to locate the source of the failure; second, the failed elements must be
repaired; and third, the unit must again be thoroughly tested to verify its fault-free operation.
This operation is shown in figure 69.
Shop maintenance costs accumulate from several sources: direct man-hour labor; overhead
computed as a function of direct labor; material expenses necessary for repairs; and the cost
of using automatic test equipment when applicable. Since the cost comparisons of interest
deal with the same basic ARCS hardware, it was assumed that there will be no difference in
material expenditures, and hence the following computation considers only the time-related
parameters.
The trend in current shop maintenance operations is to perform unit testing with the aid of
automatic test equipment (ATE). ATE simplifies shop maintenance procedures but adds an
additional hourly cost factor, estimated to be $23. It was assumed that the ARCS would be
tested using ATE in an airline maintenance program. That is, ATE will always be required
[Flowchart labels recovered from the figure: fault localization test, test procedure, stock.]
Figure 69.—Shop Maintenance Operation
to perform the postrepair verification test, even though operator experience has shown that
when a sufficiently comprehensive built-in self-test is provided, it can often be used for fault
localization in lieu of ATE. The major ARCS element, i.e., the computer LRU, will certainly
include such capability as a part of the system test function, and it was assumed that this
will adequately cover 95% of all incurred failures as defined by the system test coverage goal.
The other 5% will require full use of ATE to localize the failure, and the ATE will be used to
make a 100% verification test of the unit's operation.
A review of ATE running times for contemporary analog flight control systems indicates
an average run time of 1.5 to 2.0 hours when no faults are found and when no operator inter-
vention is required. The only digital system for which data is available is the air data computer
for the DC-10, and it too requires 1.5 hours. The assumption was therefore made that the
ARCS would require 1.5 hours of ATE time to run with no intervention and no failures.
However, experience has shown that when ATE is being used for troubleshooting/fault-isolation
purposes, the actual ATE time is about twice the minimum run time, or approximately
3 hours.
Another assumption made was that all unverified removals will require a full ATE checkout
before they can be declared operational. The shop maintenance technician cannot know
until after the full test that the removal was unjustified.
For the following calculations, a time profile for the shop maintenance cycle was derived
and is shown in table 19. For the ARCS without system test, all incurred faults are "non-
covered" faults. All unverified removals are treated the same with or without built-in
system test.
Table 19.—Shop Maintenance Operation
                                        Time estimate, hours
Activity                          No fault    BIT-covered fault    Non-BIT-covered fault
1. Locate failed element
   • Built-in test (BIT) time     0.33        0.33                 0.33
   • ATE test time                3.0         -                    3.0
2. Repair                         -           1.0                  1.0
3. ATE verification test          -           1.5                  1.5
4. Miscellaneous                  0.5         0.5                  0.5
Total man-hours/ATE-hours         3.83/3.0    3.33/1.5             6.33/4.5
The shop maintenance costs are the sum of the costs associated with "covered" faults plus
"noncovered" faults, plus the cost of unverified removals. These computations are as
follows:
$$\text{Shop cost} = \sum (\text{number of occurrences}) \left[ (\text{man-hours/occurrence})(\text{cost/hour}) + (\text{ATE-hours/occurrence})(\text{ATE cost/hour}) \right]$$

With System Test
Event                   No. of occurrences   Man-hours   Cost/hour   ATE-hours   ATE cost/hour   Total cost
BIT-covered fault       336 x 95%            3.33        $22.80      1.5         $23             $35 247.34
Non-BIT-covered fault   336 x 5%             6.33        $22.80      4.5         $23             $ 4 163.44
No fault                352.8 - 336          3.83        $22.80      3.0         $23             $ 2 626.24
Total                                                                                            $42 037.02, or $42 000

Without System Test
Event                   No. of occurrences   Man-hours   Cost/hour   ATE-hours   ATE cost/hour   Total cost
Fault found             336                  6.0         $22.80      4.5         $23             $ 80 740.80
No fault found          672 - 336            3.5         $22.80      3.0         $23             $ 49 996.80
Total                                                                                            $130 737.60, or $131 000
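The shop cost summation can be checked with the sketch below, which applies the formula to the tabulated event counts and hours. The $22.80 labor figure is the $8 direct rate multiplied by the 2.85 overhead factor; the function and variable names are illustrative.

```python
LABOR_COST = 22.80        # dollars per man-hour ($8 direct labor x 2.85)
ATE_COST = 23.00          # dollars per ATE-hour
FAILURES_PER_YEAR = 336.0


def shop_cost(events):
    """events: iterable of (occurrences, man_hours, ate_hours) tuples."""
    return sum(n * (mh * LABOR_COST + ate * ATE_COST) for n, mh, ate in events)


with_test = shop_cost([
    (FAILURES_PER_YEAR * 0.95, 3.33, 1.5),    # BIT-covered faults
    (FAILURES_PER_YEAR * 0.05, 6.33, 4.5),    # non-BIT-covered faults
    (352.8 - FAILURES_PER_YEAR, 3.83, 3.0),   # unverified removals
])
without_test = shop_cost([
    (FAILURES_PER_YEAR, 6.0, 4.5),            # fault found
    (672 - FAILURES_PER_YEAR, 3.5, 3.0),      # no fault found
])
print(round(with_test, 2), round(without_test, 2))
```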
6.2.3.3 Spares Provisioning Costs
Because avionic equipment fails and it takes a finite amount of time to repair, an airline
must maintain a sufficient quantity of spares to support continuous fleet operation. The
costs associated with spares provisioning arise from two sources: the initial acquisition cost
and inventory or holding costs. The quantity of spare LRU's required to support an opera-
tion is in part dependent on the line demand rate, a direct result of removal rates. Inventory
or holding costs are normally computed as a percentage of initial acquisition cost.
Sufficient spare units must be available to satisfy the demand at each line maintenance
station and to fill the repair cycle pipeline. The formula derived for this is as follows:
Total spares = Base pool + in transit + line spares
The base pool consists of at least one "on-hand" spare plus the average number of LRU's in
the shop repair cycle, which is the product of the average daily usage and the average cycle
time. The in-transit term accounts for the time it takes to transport a failed
unit from the line station at which it was removed to the repair point. Line spares are
allocated to the various line stations in accordance with the projected demand, but as a
minimum, there will be one spare LRU supplied to each line maintenance station. For this
analysis, one spare LRU per station was assumed. For the operational models, these factors
were:
Base pool = 1 + (average daily usage) (5-day cycle time)
In transit = (average daily usage) (3-day transit time)
Line spares = (1 per line station) (21 line stations)
With System Test
Total spares = 1 + 5 (352.8 removals/year / 365 days/year) + 3 (352.8 removals/year / 365 days/year) + 21
             = 30 units
Without System Test
Total spares = 1 + 5 (672 removals/year / 365 days/year) + 3 (672 removals/year / 365 days/year) + 21
             = 37 units
To derive the total cost difference attributable to initial spares provisioning, it is necessary
to know the acquisition cost per unit. The ARCS computer LRU's were estimated to cost
approximately $30 500 per unit in production quantities. Therefore, the cost of spare
computer LRU's was:
With system test = (30 units) ($30 500/unit) = $915 000
$915 000 / 16 years = $57 190/year
Without system test = (37 units) ($30 500/unit) = $1 128 500
$1 128 500 / 16 years = $70 500/year
Experience shows that the annual cost for holding spare LRU's is equal to approximately
25% of their purchase cost. Therefore, spares holding costs from the above purchase cost
are:
With system test = (0.25) ($915 000) = $228 800
Without system test = (0.25) ($1 128 500) = $282 100
Therefore, the computed annual cost of providing spare computer elements for the ARCS
flight control system is:
                              With system test    Without system test
Amortized acquisition cost    $57 000             $70 500
Spares inventory cost         $228 800            $282 100
Total                         $286 000            $353 000
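The spares calculation can be sketched as follows. Rounding the spares count up to the next whole unit is an assumption; the report only gives the rounded results, and both happen to round the same way under ordinary rounding. The 16-year amortization period and 25% holding rate are taken from the text, and the function names are illustrative.

```python
import math

UNIT_COST = 30_500          # dollars per ARCS computer LRU
AMORTIZATION_YEARS = 16
HOLDING_RATE = 0.25         # annual holding cost as a fraction of purchase cost


def total_spares(removals_per_year, line_stations=21, cycle_days=5, transit_days=3):
    """Base pool + in transit + line spares, rounded up to whole units (assumed)."""
    daily_usage = removals_per_year / 365
    base_pool = 1 + daily_usage * cycle_days
    in_transit = daily_usage * transit_days
    return math.ceil(base_pool + in_transit + line_stations)


def annual_spares_cost(units):
    """Amortized acquisition cost plus annual holding cost, in dollars."""
    purchase = units * UNIT_COST
    return purchase / AMORTIZATION_YEARS + HOLDING_RATE * purchase


for label, removals in (("with system test", 352.8), ("without system test", 672)):
    units = total_spares(removals)
    print(label, units, round(annual_spares_cost(units)))
```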
6.2.3.4 Sensor System Maintenance Costs
The AFCS is a complex system involving much more than just control computers. Its
operational success is contingent upon proper operation of all of the sensor and servo
systems associated with it. A failure occurrence in any one of the peripheral sensor or
servo systems can potentially result in a degradation of AFCS performance and hence
initiate a maintenance action. Experience with contemporary flight control systems has
shown that the normal corrective action for most such squawks is to replace the major
element—the flight control computer. This practice is reflected in the low verification
rates currently achieved for these units. This effect has been taken into account in the
cost analysis by the assumption of a 50% unverified removal rate for ARCS computers
without system test capability. However, the reverse situation is also present, though to a
lesser degree; sensor systems are erroneously removed and replaced as a result of a flight
control problem. One important attribute of the system test function is the capability to
detect and register the occurrence of such sensor failures to minimize unjustified removals.
A review of in-use experience with contemporary sensor systems on DC-10 and 747 fleets
indicates that approximately 7% of all removals of air data computers, ILS receivers, and
radio altimeters were initiated as a result of a squawk against the flight control system. Of
these 7%, approximately 60% proved to be unverified. With the achievement of the system
test goal of a 90% effectiveness in properly isolating sensor system failures, it can be projected
that the percentage of sensor removals due to flight control squawks could be reduced to
3.1%. This reduction in sensor removals will also have an effect on the system test cost
effectiveness.
Table 20 defines 1975 average removal rates for 747 and DC-10 avionic equipment, adjusted
to a fleet size indicative of the selected operational model. If we assume that without system
test the sensor removal rate will remain the same as for contemporary systems, we could
expect a saving of 3.9% of the total projected removals, equal to 36 unwarranted ILS receiver
removals per year, 34 radio altimeter removals, and 47 air data computer removals.
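One reading that reproduces the 3.1% figure is sketched below. It assumes that the verified removals (40% of the 7%) stay constant and that, at 90% isolation effectiveness, they make up 90% of the remaining flight-control-related removals. This interpretation and the variable names are assumptions, not statements from the report.

```python
REMOVAL_SHARE = 0.07            # sensor removals initiated by a flight control squawk
UNVERIFIED_SHARE = 0.60         # portion of those removals found to be unverified
ISOLATION_EFFECTIVENESS = 0.90  # system test goal

verified_share = REMOVAL_SHARE * (1 - UNVERIFIED_SHARE)     # 2.8% of all removals
share_with_test = verified_share / ISOLATION_EFFECTIVENESS  # about 3.11%
share_saved = REMOVAL_SHARE - share_with_test               # about 3.9%

# Projected annual removals for the operational model (Table 20).
projected = {"ILS receiver": 921, "Radio altimeter": 872, "Air data computer": 1212}
avoided = {unit: round(n * share_saved) for unit, n in projected.items()}
print(round(share_with_test * 100, 3), avoided)
```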
Table 20.—Sensor System Removals

                                                         Projected    Annual AFCS-related removals
                     Removals     Adjustment             annual       With             Without
Equipment            per year     factor                 removals     system test(a)   system test(b)
ILS receiver         228          1 050 000/260 000       921         28.65            64.47
Radio altimeter      216          1 050 000/260 000       872         27.13            61.04
Air data computer    300          1 050 000/260 000      1212         37.71            84.84

(a) Based on 3.111% removals due to flight control squawks
(b) Based on 7% removals due to flight control squawks
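The columns of Table 20 follow from the base removal rates by simple scaling. The sketch below recomputes them; the helper name `table_row` is illustrative, and rounding the projected removals to whole numbers before applying the percentages is an assumption that happens to reproduce the tabulated values.

```python
ADJUSTMENT = 1_050_000 / 260_000   # adjustment factor from Table 20
RATE_WITH_TEST = 0.03111           # footnote (a)
RATE_WITHOUT_TEST = 0.07           # footnote (b)

BASE_REMOVALS = {"ILS receiver": 228, "Radio altimeter": 216, "Air data computer": 300}

def table_row(base_removals):
    """Return (projected annual removals, AFCS-related with test, without test)."""
    projected = round(base_removals * ADJUSTMENT)
    return (projected,
            round(projected * RATE_WITH_TEST, 2),
            round(projected * RATE_WITHOUT_TEST, 2))

rows = {unit: table_row(n) for unit, n in BASE_REMOVALS.items()}
for unit, row in rows.items():
    print(unit, row)
```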
Table 21 identifies parameters pertinent to the computation of sensor maintenance costs.
Both line and shop maintenance time requirements have been obtained from current
operating experience and represent an average of 747 and DC-10 data.
Table 21.—Sensor Maintenance Parameters

                                    Line maintenance,    Shop maintenance,
                                    man-hours per        man-hours per
Unit                 Cost/unit      removal              removal
ILS receiver         $ 4 800        1.27                 10.25
Radio altimeter      $ 5 550        1.29                  4.21
Air data computer    $18 070        1.14                  7.81
Using these parameters, the following cost calculations show sensor maintenance effects on
system test cost-of-ownership.
Line maintenance cost = Σ (no. of removals) (man-hours/removal) (cost/man-hour)

With System Test
ILS:                  28.65 (1.27 hours) ($22.80/hour) = $ 829.59
Radio altimeter:      27.13 (1.29 hours) ($22.80/hour) =   797.95
Air data computer:    37.71 (1.14 hours) ($22.80/hour) =   980.16
Total                                                    $2607.70
                                                      or $2600.00

Without System Test
ILS:                  64.47 (1.27 hours) ($22.80/hour) = $1866.79
Radio altimeter:      61.04 (1.29 hours) ($22.80/hour) =  1795.31
Air data computer:    84.84 (1.14 hours) ($22.80/hour) =  2205.16
Total                                                    $5867.26
                                                      or $5900.00
Shop maintenance cost = Σ (no. of removals) (man-hours/removal) (cost/man-hour)

With System Test
ILS:                  28.65 (10.25 hours) ($22.80/hour) = $ 6 695.51
Radio altimeter:      27.13 (4.21 hours) ($22.80/hour)  =   2 604.15
Air data computer:    37.71 (7.81 hours) ($22.80/hour)  =   6 714.94
Total                                                     $16 014.60
                                                       or $16 000.00

Without System Test
ILS:                  64.47 (10.25 hours) ($22.80/hour) = $15 066.64
Radio altimeter:      61.04 (4.21 hours) ($22.80/hour)  =   5 859.11
Air data computer:    84.84 (7.81 hours) ($22.80/hour)  =  15 107.29
Total                                                     $36 033.04
                                                       or $36 000.00
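The line and shop maintenance sums can be reproduced as follows. This is a minimal sketch: the data layout and function name are illustrative, and decimal arithmetic with half-up rounding to the cent is an assumption chosen so that each line item matches the figures quoted above.

```python
from decimal import Decimal, ROUND_HALF_UP

LABOR_RATE = Decimal("22.80")   # dollars per man-hour
CENT = Decimal("0.01")

# Man-hours per removal from Table 21: (line, shop)
HOURS = {
    "ILS receiver": (Decimal("1.27"), Decimal("10.25")),
    "Radio altimeter": (Decimal("1.29"), Decimal("4.21")),
    "Air data computer": (Decimal("1.14"), Decimal("7.81")),
}
# Annual AFCS-related removals from Table 20
REMOVALS = {
    "with": {"ILS receiver": "28.65", "Radio altimeter": "27.13", "Air data computer": "37.71"},
    "without": {"ILS receiver": "64.47", "Radio altimeter": "61.04", "Air data computer": "84.84"},
}

def maintenance_cost(case, kind):
    """Sum removals x man-hours x labor rate, rounding each item to the cent."""
    index = 0 if kind == "line" else 1
    total = Decimal("0")
    for unit, removals in REMOVALS[case].items():
        cost = Decimal(removals) * HOURS[unit][index] * LABOR_RATE
        total += cost.quantize(CENT, rounding=ROUND_HALF_UP)
    return total

for case in ("with", "without"):
    print(case, maintenance_cost(case, "line"), maintenance_cost(case, "shop"))
```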
Spares provisioning cost* = Σ [(1/16) (acquisition cost) + inventory cost]
*To cover spares required for 8 days' supply: 5 days' repair cycle time, 3 days' transit time.
With System Test
ILS:                  8 (28.65/365) = 0.628 units at $4800 ea   = $ 3 014.14
Radio altimeter:      8 (27.13/365) = 0.595 units at $5500 ea   =   3 270.47
Air data computer:    8 (37.71/365) = 0.827 units at $18 070 ea =  14 935.23
Total acquisition cost                                            $21 219.84
Annual acquisition cost = (1/16 x total)                        = $ 1 326.24
Inventory cost = 0.25 (total acquisition cost)                  =   5 304.96
Total annual sensor spares cost                                 = $ 6 631.20
                                                               or $ 6 600.00

Without System Test
ILS:                  8 (64.47/365) = 1.41 units at $4800 ea    = $ 6 783
Radio altimeter:      8 (61.04/365) = 1.34 units at $5500 ea    =   7 358
Air data computer:    8 (84.84/365) = 1.86 units at $18 070 ea  =  33 601
Total acquisition cost                                            $47 742
Annual acquisition cost = (1/16 x total)                        = $ 2 984
Inventory cost = 0.25 (total acquisition cost)                  =  11 935
Total annual sensor spares cost                                 = $14 919
                                                               or $15 000
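The sensor spares figures follow the same pattern as the computer spares: an 8-day supply is sized from the removal rate, priced, amortized over 16 years, and charged a 25% holding cost. The sketch below is illustrative only; names are hypothetical, and the radio altimeter price of $5500 is the value used in the worked calculation above rather than the $5550 listed in Table 21.

```python
SUPPLY_DAYS = 8       # 5 days' repair cycle plus 3 days' transit
LIFE_YEARS = 16
HOLDING_RATE = 0.25

# Unit prices as used in the worked calculation, dollars
PRICES = {"ILS receiver": 4800, "Radio altimeter": 5500, "Air data computer": 18070}

def annual_sensor_spares_cost(removals_per_year):
    """Return (total acquisition cost, total annual cost) in dollars."""
    acquisition = sum(
        SUPPLY_DAYS * removals_per_year[unit] / 365 * PRICES[unit] for unit in PRICES
    )
    annual = acquisition / LIFE_YEARS + HOLDING_RATE * acquisition
    return acquisition, annual

with_test = annual_sensor_spares_cost(
    {"ILS receiver": 28.65, "Radio altimeter": 27.13, "Air data computer": 37.71})
without_test = annual_sensor_spares_cost(
    {"ILS receiver": 64.47, "Radio altimeter": 61.04, "Air data computer": 84.84})
print(with_test, without_test)
```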
6.2.3.5 Acquisition Costs
Acquisition costs attributable to the ARCS system test function are primarily associated with
the system test panel (STP) and its associated interface electronics and the system test
software development. The acquisition cost attributable to the STP and its
electronics was estimated to be $6000 based on a production run of 330 units. System test
software development costs were estimated to be 10% of the total ARCS software cost,
which is approximately $1 million.
Amortized over the 16-year operational life of the equipment, these two acquisition costs
yield an annual expense of approximately $433 per airplane, or $65 000 for the defined fleet.
6.2.3.6 System Test Cost Conclusions
Table 22 summarizes the potential improvement in annual direct maintenance costs that
can be expected from the inclusion of an integrated system test function. These cost factors
are expressed as annual costs for a fleet of 150 aircraft in the defined operating environment.
Taking the maintenance cost saving and subtracting the acquisition cost of system test
yields a saving of approximately $1000 per airplane per year, or $16 000 over the 16-year
operational lifetime of each airplane.
Table 22.—Maintenance Costs Per Year

                         With system    Without system
Cost factor              test           test              Difference
Line maintenance
  Computer               $  3 300       $ 28 800          $ 25 500
  Sensor systems            2 600          5 900             3 300
Shop maintenance
  Computer                 42 000        131 000            89 000
  Sensor systems           16 000         36 000            20 000
Spares provisioning
  Computer                286 000        353 000            67 000
  Sensor                    6 600         15 000             8 400
Total                    $356 500       $569 700          $213 200
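The per-airplane saving quoted above follows directly from the Table 22 totals and the $65 000 annual fleet cost of the system test function. A minimal check, with illustrative variable names:

```python
FLEET_SIZE = 150
LIFE_YEARS = 16

total_with_test = 356_500      # Table 22, dollars per year
total_without_test = 569_700   # Table 22, dollars per year
system_test_cost = 65_000      # annual acquisition cost for the fleet (sec. 6.2.3.5)

maintenance_saving = total_without_test - total_with_test
net_fleet_saving = maintenance_saving - system_test_cost
net_per_airplane = net_fleet_saving / FLEET_SIZE
net_per_airplane_lifetime = net_per_airplane * LIFE_YEARS
print(maintenance_saving, net_fleet_saving, net_per_airplane, net_per_airplane_lifetime)
```

The result, about $988 per airplane per year, rounds to the $1000 figure used in the text.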
6.2.4 COST/BENEFIT SUMMARY
ARCS showed an acquisition cost saving of approximately $34 000 per ship set compared
to the WWCS. This cost saving is attributable to the ARCS computer and I/O being placed
in a single unit, as well as to the ARCS redundancy management functions being imple-
mented primarily in software.
In assessing the number of weather-caused diversions that could be avoided because
of the greater availability of the ARCS, three parameters were considered: the number of
scheduled arrivals at airports with Cat II or Cat III ground facilities, the probability of
encountering Cat II or Cat III weather conditions, and the probability that the required
flight control system function was available. Using the defined airline model and a specific
route structure typical of United's 727 fleet, the ARCS-equipped airline can be expected
to incur approximately 69 fewer diversions per year as compared to the WWCS. Based
on an average United Airlines cost per diversion, this can save approximately $145 000
annually.
Including the system test maintenance feature in the ARCS can further save approximately
$150 000 for the same airline model. This cost saving is brought about largely by improving
the effectiveness (success probability) of the maintenance action, thereby reducing the
number of unwarranted equipment removals experienced with today's systems.
The cumulative effect of system acquisition cost and avoided diversion expense for the
defined airline model is an annual cost saving of approximately $495 000, more than
$3000 per aircraft. The inclusion of system test yields a further saving of approximately
$1000 per aircraft per year.
Other cost savings attributable to the ARCS can be expected as we look deeper into the
cost-of-ownership picture: reduced spares, maintenance procedural improvements, dispatch
availability, etc.
7.0 ARCS IMPLEMENTATION
The ARCS study was the first phase of a potential three-phase NASA program to specify,
design, implement, and test an advanced airborne reconfigurable computer system. In
anticipation of a second phase (the ARCS hardware implementation), the question had to
be considered whether the best alternative would be to modify existing NASA equipment,
to adapt other suitable available hardware, or to build entirely new hardware. Included in
the ARCS work statement, therefore, were the tasks of assessing the feasibility of modifying
the GE MCP-703 (WWCS) to conform to the ARCS configuration and of identifying suitable
alternate, commercially available airborne digital systems adaptable to the ARCS design.
The following sections present the computer system design specification resulting from
the design synthesis and analysis efforts of the ARCS study (sec. 7.1) and the assessment
of the feasibility of modifying the WWCS to the ARCS configuration (sec. 7.2).
7.1 ARCS DESIGN SPECIFICATION
This section specifies the functional, software, and hardware design requirements applicable
to an advanced reconfigurable computer system. The specification will serve as a basis
for evaluating the fault-tolerance capability of existing airborne computer systems intended
for application to commercial transports and for establishing the degree of modification
required, if any, to upgrade such systems to ARCS standards.
The design specification is divided into five sections. The first section covers the real-time
aspects of the system, the second and third sections cover the ground test and on-line
monitoring aspects, respectively, and the fourth section covers the fault-tolerant aspects
of the system in detail. Functional, software, and hardware considerations are integrated into
each of these sections. The last section covers hardware design requirements of a fault-
tolerant computer system.
7.1.1 ARCS REAL-TIME OPERATIONS
To ensure proper execution control of ARCS critical real-time operations, the software
design shall be functionally organized into modules that use only the basic single-entry/
single-exit constructs of concatenation, loop, and if-then-else. The organization of the
software shall also reflect the data requirements of each system module and result in a
hierarchical data structure with minimum intermodule data requirements.
The real-time operations shall be functionally structured into synchronous operations,
asynchronous operations, and interrupt operations. Synchronous and interrupt operations
are further specified in the next two sections. Asynchronous operations are covered in
section 7.1.3.
7.1.1.1 Synchronous Operations
The time-critical application functions (control law computations and mode control
logic) and redundancy management functions shall be processed in frame-time
synchronization. Synchronous operation is preferred to asynchronous to: (1) facilitate
bit-similar/bit-identical processing to eliminate the need for feedback equalization,
(2) allow timely fault localization/isolation and recovery initiation by minimizing processing
wait times insofar as cross-channel information exchanges are concerned, (3) allow tighter
fault detection thresholds, and (4) minimize cross-channel labeling and associated RAM
requirements.
Frame synchronization shall be performed by a software algorithm based on cross-
channel exchange of sync status information. The processing and interface requirements
for the application functions are specified in table 23. Redundancy management functions
are specified in section 7.1.4.
Table 23.—Processing and Interface Requirements for ARCS

                                        Near-term      Intermediate-       Far-term
Required capability                     application    term application    application
Instruction throughput, kops            300            400                 600
  (85% add, 10% multiply, 5% divide)
Minor frame time, ms                     20             20                  10
ROM size, K                               8              8                   8
RAM size, K                               4              4                   4
Analog inputs                            24             16                  42
Discrete inputs                          24             16                  12
Digital inputs                            3              3                   4
Analog outputs                            8              5                  14
Discrete outputs                         20             18                  18
Digital outputs                           1              1                   2
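Taken together, the throughput and minor frame time in Table 23 bound the number of instructions available in each frame. The short sketch below derives that budget; it is a derived illustration rather than a requirement stated in the specification, and it ignores the instruction mix.

```python
# (throughput in kops, minor frame time in ms) from Table 23
REQUIREMENTS = {
    "near-term": (300, 20),
    "intermediate-term": (400, 20),
    "far-term": (600, 10),
}

def instructions_per_frame(kops, frame_ms):
    """kops x ms = (1000 ops/s) x (0.001 s) = ops, so the product is the budget."""
    return kops * frame_ms

budget = {name: instructions_per_frame(*spec) for name, spec in REQUIREMENTS.items()}
print(budget)
```

On these figures the far-term application has the same per-frame budget as the near-term one, because the doubled throughput is offset by the halved frame time.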
7.1.1.2 Interrupt Operations
All processing control functions initiated by a hardware interrupt shall be functionally
organized into one software module. Hardware interrupt functions shall be provided
for the following purposes:
• Frame-time iteration reference
• Arithmetic overflow
• Memory parity fault
• System test panel input
• Power-on
All interrupts except power-on shall be maskable in software.
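The masking rule can be illustrated with a small model. The bit assignments, the mask convention (a set bit blocks the interrupt), and the function name are assumptions made for this sketch; the specification names only the five interrupt sources and the rule that power-on cannot be masked.

```python
from enum import IntFlag

class Interrupt(IntFlag):
    FRAME_TIME = 1            # frame-time iteration reference
    ARITHMETIC_OVERFLOW = 2
    MEMORY_PARITY = 4
    STP_INPUT = 8             # system test panel input
    POWER_ON = 16

NON_MASKABLE = Interrupt.POWER_ON

def deliverable(pending, mask):
    """Return the pending interrupts that reach the interrupt module.

    A set bit in `mask` blocks that interrupt, except that power-on
    is never blocked regardless of the mask contents.
    """
    effective_mask = mask & ~NON_MASKABLE
    return pending & ~effective_mask

ALL = (Interrupt.FRAME_TIME | Interrupt.ARITHMETIC_OVERFLOW |
       Interrupt.MEMORY_PARITY | Interrupt.STP_INPUT | Interrupt.POWER_ON)

pending = Interrupt.POWER_ON | Interrupt.FRAME_TIME
print(deliverable(pending, ALL))
```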
7.1.2 GROUND TEST OPERATIONS
The ARCS automated built-in test and diagnostic function shall test for failures within
the ARCS (sensors, computers, and servoactuators) and localize them to a line replaceable
unit (LRU). The ground testing function shall be available on the ground only when
requested by the operator.
Ground test shall use a "center out" test philosophy, wherein basic elements are tested
first and then used to test other functions (computer LRU followed by sensors and
concluding with servo systems). The computer testing shall include central processor,
hardware monitors, RAM and ROM memories, and input/output elements.
7.1.2.1 Processor Testing
The processor self-test shall perform an instruction test sequence within which all instruc-
tions or, as a minimum, all microinstructions are exercised, all registers involved, and
all addressing modes used.
7.1.2.2 Hardware Monitors
The ability of each of the computer hardware monitors to detect and annunciate the
existence of a fault condition shall be verified as part of the ground test.
The ability of the watchdog monitor to indicate a "good" state and a failed state, and
the ability of the arithmetic error detector to detect an overflow condition and generate
the corresponding interrupt, shall be verified.
A parity generator inversion discrete, controllable by software, shall cause the words
to be stored in memory with even parity to provide a test of the operation of the
parity error interrupt. Where separate parity error detectors are used for different sections
of memory, this test shall be provided for each parity detector in the system.
7.1.2.3 Memory Testing
Read-only memory (ROM) shall be tested using sum checking techniques. Random
access memory (RAM) shall be checked by writing a predetermined data pattern into
all accessible RAM and comparing the readout.
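The two memory tests can be sketched as follows. This is a simplified model, not the ARCS implementation: memory is represented by Python byte arrays, the 16-bit additive checksum is one possible sum check, and the use of two complementary patterns (rather than a single predetermined pattern) is an assumption. `StuckBitRam` is a hypothetical fault model used only to show the test failing.

```python
def rom_checksum(rom: bytes) -> int:
    """Additive sum check over the ROM contents, truncated to 16 bits."""
    return sum(rom) & 0xFFFF

def rom_test(rom: bytes, expected_sum: int) -> bool:
    return rom_checksum(rom) == expected_sum

def ram_test(ram: bytearray, patterns=(0x55, 0xAA)) -> bool:
    """Write each pattern to every location, then read back and compare.

    The test overwrites RAM contents, so it suits ground test, not flight.
    """
    for pattern in patterns:
        for addr in range(len(ram)):
            ram[addr] = pattern
        for addr in range(len(ram)):
            if ram[addr] != pattern:
                return False
    return True

class StuckBitRam(bytearray):
    """Fault model: bit 0 of location 3 is stuck at 1."""
    def __setitem__(self, addr, value):
        if addr == 3:
            value |= 0x01
        super().__setitem__(addr, value)

rom_image = bytes(range(16))
print(rom_test(rom_image, rom_checksum(rom_image)),
      ram_test(bytearray(8)), ram_test(StuckBitRam(8)))
```

The stuck bit passes the 0x55 pattern and is caught only by 0xAA, which is why the sketch uses complementary patterns.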
7.1.2.4 Input/Output Testing
The hardware shall include the capability to switch, under software control, all output
signals into all input channels.
7.1.2.5 Sensor Testing
Provisions shall be made for verifying the operational integrity of the sensors and their
associated interfaces. Sensor testing shall make use of sensor system self-testing capabilities
where available.
7.1.2.6 Servo Testing
Servo subsystems shall be tested for satisfactory engagement/disengagement control,
dynamic response, and force override characteristics. The servo ground test function shall
be initiated only after a specific request from the test operator.
7.1.2.7 Ground Test/Crew Interface
The interface between the flight or ground crew and the system test function is the
system test panel (STP), which shall provide the operator with the means to initiate
the test function and display the test results.
7.1.3 ASYNCHRONOUS OPERATIONS
Asynchronous operations are those tasks that need not be performed simultaneously
in all computers. The asynchronous operations shall include on-line self-testing, maintenance
data update, and STP processing.
On-line self-testing shall detect and/or localize failure conditions not detectable by
first-level system monitors. It shall include processor self-test/diagnostics, memory testing,
and input and output electronics testing. Input and output testing capability shall be
provided by a continuous wrap-around loop test utilizing dedicated output and input
channels.
The maintenance data update task shall centrally collect all failure data for maintenance
purposes, including failure information accumulated during the previous in-flight test
and the failure status of the system monitors, i.e., SSFD, computer output monitor,
and servo monitor. The maintenance data update process shall resolve the LRU location
in which a failure has occurred and, if not previously done, record it in a nonvolatile
section of memory for future use by maintenance personnel.
The STP processing shall interface the on-line test program with the operator. It shall
accept operator request inputs, as well as process and format output data to be displayed
to the operator.
7.1.4 REDUNDANCY MANAGEMENT
Redundancy management functions shall be structured into four groups: cross-channel
synchronization, cross-channel data transmission, reconfiguration, and recovery update.
Each of these functions shall be performed so that each redundant channel can autonomously
assess the redundancy status of the system and, based on this assessment, operate in
the corresponding channel redundancy state. Under no circumstances shall one computer
operation, or combination of computer operations, interrupt the normal operation of
another computer.
7.1.4.1 Cross-Channel Synchronization
The synchronization process following recovery from a loss-of-sync fault shall be
identical to that following power-on. Initial synchronization and normal frame
synchronization shall be performed by system software via setting and clearing of sync indicators
that are cross-channel exchanged between each pair of computers.
7.1.4.2 Cross-Channel Data Transmission
All variable data that are input, output, or history of the computation during each
minor frame shall be spontaneously cross-channel exchanged between all computers.
The receiving computer has sole jurisdiction in the use of the data.
The data rates required for cross-channel transmission of variable data for the ARCS
applications are:
Near term Intermediate term Far term
20K 25K 50K
7.1.4.3 Reconfiguration
Reconfiguration includes fault detection, fault localization, and fault isolation of sensor,
computer, and servo functions.
Sensor signal selection and failure detection shall be performed by software using cross-
channel sensor data. The SSFD algorithm shall meet the following requirements.
• The sensor selection algorithm must be able to isolate the effects of all types of
faults, including open failures, hardover failures, and any ramp failures and oscillatory
failures.
• The signal selection algorithm must provide an output that is acceptable to the
application task for normal (unfailed) operation, and during any fault condition in
one of the redundant input signals.
• Means must exist to detect and isolate the effect of a sensor failure, and to revise
the failure-detection algorithm to be compatible with the lower redundancy,
before the probability of incurring an additional like failure may be high enough
to create an unacceptable risk.
• The failure-detection algorithm must operate in the presence of normal signal tolerances,
such as biases, scale factors, and linearity errors, and in the presence of noise.
• The proportion of "nuisance failures" (leaky transients) due to signal tolerances
and noise must be insignificant compared to the number of genuine failures.
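One common way to meet requirements of this kind is mid-value selection with a miscompare threshold and a persistence count. The sketch below illustrates that general technique; it is not the SSFD algorithm defined by the ARCS study, and the class name, threshold, and persistence values are assumptions chosen for the example.

```python
from statistics import median

class SignalSelector:
    """Mid-value select with threshold/persistence failure detection."""

    def __init__(self, threshold, persistence):
        self.threshold = threshold      # allowed deviation from the selected value
        self.persistence = persistence  # consecutive miscompares before isolation
        self.miscompares = {}
        self.failed = set()

    def select(self, samples):
        """samples: dict of channel name -> value for one minor frame."""
        valid = {ch: v for ch, v in samples.items() if ch not in self.failed}
        selected = median(valid.values())
        for ch, value in valid.items():
            if abs(value - selected) > self.threshold:
                self.miscompares[ch] = self.miscompares.get(ch, 0) + 1
                # Isolate only while three or more signals remain; with two,
                # a miscompare cannot identify which signal has failed.
                if self.miscompares[ch] >= self.persistence and len(valid) > 2:
                    self.failed.add(ch)
            else:
                self.miscompares[ch] = 0   # a transient does not accumulate
        return selected

selector = SignalSelector(threshold=0.5, persistence=3)
outputs = [selector.select({"A": 1.0, "B": 1.1, "C": 100.0}) for _ in range(4)]
print(outputs, selector.failed)
```

In this run the hardover on channel C never reaches the output: the mid value masks it for three frames, after which C is isolated and the selector falls back to the average of the two remaining signals. The persistence count is what keeps short transients from being declared failures.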
Fault monitoring shall be performed by software on computational results prior to output
of servo commands. The output monitor shall establish the validity of the local computer's
processing based on comparisons with other computers' results. If the output monitor
determines that a local computer fault has occurred, it shall initiate disengagement of
all affected servos.
A software monitor shall check the integrity of the cross-channel data buses.
Recovery of a computer, as indicated by reestablished synchronization, shall cause
working computers to reset their failure status indicators. Recovery for a faulted computer
shall consist of updating its failure status indicators and variable data history to those of
a working computer.
Pilot intervention shall not be relied upon for system reconfiguration processes. The
system shall be designed with the capability of automatic start and synchronization
after power turn-on and automatic restart and resynchronization after transient power
faults or massive transient signal faults.
7.1.5 HARDWARE DESIGN REQUIREMENTS
To support assumptions on which the ARCS design analyses were founded, and which
subsequently resulted in the selected ARCS configuration, the following specific hardware
design requirements shall apply.
7.1.5.1 Independent Computer Monitor
An independent, fail-safe monitor of the real-time operation of each computer shall be
an integral part of each computer unit. If the measured time interval between the
clearing of consecutive sync indicators is not within specified upper and lower limits
of the nominal frame time period, the independent monitor shall indicate a computer
fault condition. If the computer's sync indications return to within the required periodic
interval, the independent monitor shall clear the fault indication.
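The monitor logic described above can be modeled in a few lines. This is a behavioral sketch only: the real monitor is independent hardware, and the class name, millisecond time base, and `poll` method (which flags a computer that has stopped clearing its sync indicator altogether) are assumptions added for illustration.

```python
class FrameMonitor:
    """Checks the interval between consecutive sync-indicator clearings."""

    def __init__(self, nominal_ms, tolerance_ms):
        self.lower = nominal_ms - tolerance_ms
        self.upper = nominal_ms + tolerance_ms
        self.last_clear = None
        self.fault = False

    def sync_cleared(self, now_ms):
        """Call when the computer clears its sync indicator."""
        if self.last_clear is not None:
            interval = now_ms - self.last_clear
            # Fault is set when out of limits and cleared when back in limits.
            self.fault = not (self.lower <= interval <= self.upper)
        self.last_clear = now_ms
        return self.fault

    def poll(self, now_ms):
        """Flag a fault if the next clearing is overdue (computer halted)."""
        if self.last_clear is not None and now_ms - self.last_clear > self.upper:
            self.fault = True
        return self.fault

monitor = FrameMonitor(nominal_ms=20, tolerance_ms=2)
history = [monitor.sync_cleared(t) for t in (0, 20, 45, 65)]
print(history)
```

Here the 25-ms interval ending at t = 45 sets the fault indication, and the following in-limit interval clears it, as the requirement specifies.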
7.1.5.2 Channel Separation and LRU Packaging
Computer and interface hardware shall be dedicated on a per-channel basis. Based on
maintenance and logistics considerations, a single LRU for computer and interface
electronics is preferred.
The hardware design shall be constructed to survive a cable failure as a single failure,
where all the wires in a cable may be open or shorted to ground. It shall also survive,
as a single failure, the computer LRU sliding out of the receiving tray, where all pins
on the LRU connectors disengage almost simultaneously.
7.1.5.3 Design Integrity
The hardware shall be designed to withstand and suppress the effects of electrical hazards
including lightning stroke. Components in separate channels shall be completely isolated
electrically. Cross-channel data transfers between computers shall use serial, optical data links.
7.1.5.4 General Architecture
The computer architecture shall be compatible with both ROM/RAM and core memory.
The hardware design shall be adaptable to minimum quadruplex redundancy. A basic
word length of 16 bits with the capability of efficient double precision, or a basic
word length of 24 bits, is required.
7.1.5.5 General Hardware Design Specification References
The hardware design shall meet the requirements specified by reference 3.
The components selected for the hardware design shall equal or exceed the following
standards.
Microelectronics B-l Mil-STD-883, Method 5004, B
Discrete semiconductors JAN Mil-M-38510, C
Capacitors (ER)—Aluminum R
Capacitors (ER)—All others M
Resistors (ER) M
All other (non-ER) parts Mil-Spec
7.2 FEASIBILITY OF WWCS MODIFICATION
When the ARCS program first got under way, it was thought that one desirable method
of obtaining some actual hardware experience with many of the important ARCS
architectural concepts would be to modify the WWCS. The WWCS was to serve as the
point of departure and baseline system architecture for the ARCS study program. As
a result, a task was defined for the ARCS contractors to delineate those modifications
that could be made to the WWCS to bring it to a near-ARCS configuration at some
future time.
Since the ARCS study program began, the electronics industry has made enormous
strides in larger scale integration and lower power circuits. ARCS has taken advantage
of these developments in circuit technology and has moved nearly all the redundancy
management functions into software. Thus, the ARCS configuration represents a
dramatically different architecture than the WWCS.
This section presents a series of modifications that could be implemented in the WWCS
to produce a near-ARCS structure for evaluating many of the ARCS concepts. Although
these modifications would permit experimentation with many of the ARCS architectural
concepts, they result in a considerable decrease in available computational time for
control laws within the WWCS. Considering the WWCS as a control law experimentation
computer system, these modifications are not recommended.
7.2.1 WWCS-ARCS COMPARISON
The primary differences between the ARCS and WWCS are the following.
1. The WWCS is a TMR system designed around a combination of hardware and
software redundancy management, with the hardware playing a very strong role.
The ARCS is a triplex system designed to degrade in the face of multiple failures
to simplex operation for any module. The redundancy management is primarily
in software, although the independent servo monitor and watchdog monitor are
hardware functions.
2. The WWCS employs a fail-operational oscillator system to provide a bit-synchronous
source of timing for input and exchanges and to generate the time base for real-time
operations. The ARCS channels perform operations using completely independent
timing sources and use interchannel discretes for frame synchronization.
3. The WWCS uses relatively restricted computer-controlled interchannel data links
only for preflight test data exchange. The ARCS contains fast and flexible computer-
controlled interchannel data links that employ a LIFO buffer in the transmitter
and DMA storage of the received data into dedicated zones in a multiport memory.
4. The ARCS contains an independent watchdog monitor configured to detect when
the processor is no longer a logical element and to disengage all the servoactuators
connected to that channel. The WWCS contains no such circuit since there is no
attempt to operate a single channel of the WWCS by itself as a degraded system mode.
7.2.2 WWCS MODIFICATIONS
The following modifications could be made without severely impacting the WWCS
hardware.
1. Six cables would be rerouted to defeat the bit-synchronous timing throughout the
system and permit each computer unit/interface unit pair to operate independently.
The interchannel exchange of raw sensor data would be severed, and therefore,
the utility of the hardware sensor select electronics would be deleted, which means
that sensor redundancy management must be performed in software. This change
must be accompanied by item 2 if the servo transmit/receive unit (STRU) is to
remain a part of the system.
2. The interchannel cable would be modified to delete the interchannel exchange of
STRU data and timing. This modification would permit each channel of the STRU
to operate independently in conjunction with the corresponding computer and
interface units.
3. The synchronous logic interface (SLI) electronics would be redesigned and
reprogrammed to minimize the number of memory cycles used for input/output during
each computational minor cycle consistent with interfacing with a single interface
unit.
4. The status register would be extended by two bits to accommodate interchannel
synchronization discretes, and these discretes would be wired across channel. This
would give each processor means for signaling all other processors for synchronization
without using the processor-controlled input/output system, which is restricted
to background usage in the MCP-703 system. This expansion of the status register
could be accommodated on the SLI circuit boards.
5. An improved interchannel data link would be designed and installed with a conventional
DMA structure on each end (transmitter and receiver) or with a FIFO or LIFO
buffer on the transmitter end if a comparable block of memory is given up. This
improved interchannel link could reside in the spare card locations that have been
opened up through the provisions of an external memory.
6. A watchdog monitor comparable to that designed for the ARCS would be installed
in the CIU in one of the spare card locations.
7. The processor would be given access to the error reset discrete on a full-time basis.
8. Substantial software modifications required for the above hardware modifications
include the following.
a. A frame synchronization routine must be developed and integrated into the
WWCS executive.
b. Recovery procedures must be finalized and the code required must be integrated
into the WWCS software.
c. An extensive computer self-test routine must be developed.
d. A new BIT program must be written and integrated with the WWCS software
package.
e. A new interchannel driver routine must be developed.
f. The sensor redundancy management routine must be modified to account
for the different location of interchannel raw sensor data.
g. A new system test routine must be developed.
Although many other areas of the software would be affected in a minor way,
these listed programs would be new or severely impacted.
8.0 CONCLUDING SECTION
This section summarizes the results of the ARCS study and presents the conclusions
drawn from the results.
8.1 SUMMARY OF RESULTS
The results of the criteria development, the conceptual design, and the analysis tasks
are restated below in summarized form. Complete information on these subjects resides
in sections 4 through 6.
8.1.1 ARCS DESIGN CRITERIA
The design criteria derivation had the objective of identifying economic and flight
safety considerations at the aircraft operational level that influence the specification
of on-board computer systems. Specific design requirements and design principles
for an airborne fault-tolerant (redundant) digital computer system were then formulated
from an interpretation of the design criteria, applying flight-critical system design practices
to avoid potential single-point system failures.
ARCS has the potential to reduce aircraft costs in three areas: the functional scope of
the system, which influences the initial system cost as well as the potential operational
benefits of the system; the functional availability of the system, which determines the
degree of achieving the potential operational benefits; and the system maintainability,
which influences the burden of maintaining the system at the functional status required
to achieve the operational benefits.
A control/stability augmentation function and a Category III autoland function were
postulated to define the scope of the ARCS.
The economics of an airplane's on-schedule performance dictates a particular function's
desired or needed availability. This functional availability translates into computer system
design requirements that will set the system's failure survival capability. No explicit
functional availability criteria exist for the ARCS application functions. However, airlines
have gone on record expressing a desired average probability for an operational Cat III
capability. An average probability of having an operational Cat III capability greater than
0.9 with a 100-hour maintenance interval was postulated as a design goal for the ARCS
design task.
Significant requirements identified for the third economic issue, system maintainability,
were that only "condition monitoring" maintenance be acceptable, that functional
integrity checks be automated, and that system test features be contained within the
on-board equipment.
Flight safety criteria are an essential part in assessing the airworthiness of an airplane and
its systems. Specifying safety-of-flight goals implies defining the level of accident risk,
or probability of accident, that can be tolerated for the total aircraft, as well as the risk
contribution that can be tolerated because of particular subsystems in the aircraft.
The basis for these requirements is established by the Federal Aviation Regulations.
Interpretation of these regulations resulted in assessing the failure probability for the
CCV/FBW function to be less than 1 x 10^-9 for a 1-hour flight and the failure probability
for the autoland function to be less than 1 x 10^-7 for a 45-second exposure.
Two fundamental design requirements for the fault-tolerant computer system were
identified. First, the system must be capable of gracefully degrading from a triplex or
quadruplex redundancy level to a simplex string of operable elements—sensors, computer,
and servos. Second, the computer system must not be dependent on flight crew intervention
for startup or reconfiguration process initiation. The system must therefore be capable
of automatically establishing fully redundant operation following power turn-on and
must be able to automatically reestablish the highest operable level of redundancy following
transient fault conditions.
Designing against a system single-point failure mode is absolutely necessary if the design
is to meet the above operational requirements. Redundant channel interdependence is
a potential source for such a failure mode; therefore, channel independence was deemed
to be an important requirement in the design of the reconfiguration processes. To
provide design guidelines to achieve this independence, the following design principles
were adopted: (1) each computer shall independently assess its own operational status;
(2) no computer operation, or combination of computer operations, shall interrupt the
normal operation of another computer; and (3) no servo shall be controlled by processes
outside its own channel.
Reliability predictions, on which the operational risks during the use of the system are
based, rest on assumptions about the failure status of the system at any particular time.
The system failure status, therefore, must be established through a thorough verification
process that immediately (or within a short time period) detects and localizes system
failures. Further, the functional integrity of the system must be ensured after maintenance
action has taken place. A system test function is therefore required to assemble the
system failure data for the redundancy management process and to provide a means for
checking the system functional integrity. The system test function is also to enhance
system maintainability.
8.1.2 ARCS DESIGN CONCEPT
In the functional concept developed from the design requirements, the essence of the
ARCS is the reconfiguration processes. In dealing with the various fault conditions that
can cause a transition between the possible states of the computer subsystem, a single
reconfiguration strategy was adopted. With this strategy, a power-up operation is
identical to recovery from a transient power fault.
The general reconfiguration process involves one or more of the subprocesses of fault
detection, fault localization, fault isolation, and recovery or redundancy degradation.
Recovery from a transient fault will result in a restoration of the operational state that
prevailed before the fault condition occurred. Failure of the recovery process, i.e.,
the fault is declared permanent, will result in a degradation in the redundancy state of the
affected stage.
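The recovery-or-degradation outcome described above can be sketched as a small state update. This is a minimal illustration, not the ARCS redundancy management software: the `Stage` class, the `reconfigure` function, and the level names are hypothetical, and fault detection, localization, and isolation are assumed to have already taken place.

```python
from dataclasses import dataclass

# Hypothetical redundancy levels for one stage (names are illustrative).
LEVELS = ["simplex", "duplex", "triplex", "quadruplex"]


@dataclass
class Stage:
    name: str
    level: int  # index into LEVELS; a negative value means the function is lost


def reconfigure(stage: Stage, recovered: bool) -> str:
    """Handle one detected, localized, and isolated fault in `stage`.

    recovered=True  -> the fault was transient; the prior state is restored.
    recovered=False -> the fault is declared permanent; redundancy degrades.
    """
    if recovered:
        return f"{stage.name}: transient fault, restored to {LEVELS[stage.level]}"
    stage.level -= 1
    if stage.level < 0:
        return f"{stage.name}: function lost"
    return f"{stage.name}: permanent fault, degraded to {LEVELS[stage.level]}"


computers = Stage("computer", level=2)            # start at triplex
print(reconfigure(computers, recovered=True))     # state unchanged
print(reconfigure(computers, recovered=False))    # triplex -> duplex
```

The sketch keeps one redundancy state per stage, which matches the statement that a permanent fault degrades "the affected stage" rather than the whole system.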
The heart of the ARCS is the redundancy management software. It is the implementation
in code of this redundancy management function that brings together both the hardware
and software aspects of the reconfiguration processes.
The ARCS concept emphasizes software processes in achieving fault-tolerant capabilities.
The ARCS hardware must therefore be viewed as a vehicle to facilitate an effective software
design for the overall reconfiguration and application processes.
The baseline ARCS configuration is a triplex system, readily expandable to quadruplex,
with a single computer unit per channel, containing the processor, memory, and all
channel interface electronics. Sensor, mode control, and servo interfaces are dedicated
on a channel basis with data exchanged between computers via dedicated one-way,
serial, optical, digital data buses that independently interconnect each computer to each
other computer. Each computer has exclusive control over the engagement and shutdown
of its own servos.
Key features of the computer unit design are a high throughput (420 kops) central
processor; a computer I/O system in which the input and output processes are performed
autonomously with no processor intervention required; and a solid-state memory partitioned
into functionally independent program memory (ROM) and variable/scratch-pad memory
(RAM). Input and output devices are embedded within the scratch-pad memory addressing
structure with their own scratch-pad area and are therefore directly addressable by the
processor.
Special hardware fault monitors, in addition to watchdog and servo monitors, include
an arithmetic fault detector, RAM memory parity detector, digital I/O validity checks,
and I/O loop testing provisions.
Real-time synchronization between computers is facilitated through the cross-channel
exchange of synchronization discretes independently generated within each computer.
8.1.3 ARCS DESIGN ANALYSIS
The fault tolerance and cost/benefits of an ARCS were compared with a contemporary
technology system to establish the advantages of the ARCS technology. Contemporary
technology (i.e., fail-operational/fail-passive capability in a triplex configuration), used
for example in the 747 analog Category III autoland system, was represented by the GE
MCP 703 WWCS. The system concept was analyzed with respect to its fault-tolerant
performance with two major objectives in mind: (1) to verify, through fault analysis,
that the system concept in fact had the required fault-tolerant qualities from a functional
point of view and (2) to assess, through a reliability analysis, the merit of those fault-tolerant
qualities from a probabilistic point of view. The reliability analysis task had two main
purposes: (1) to review the available reliability analysis methods and tools and to identify
those suitable for use in the ARCS study and (2) to assess the reliability of the ARCS design
and configuration alternatives within the scope of the defined operational applications.
An assessment of the cost effectiveness of applying ARCS technology was carried
out in two parts: an analysis of airline cost-of-ownership for an ARCS maintained in
a Cat III operational status and an analysis of the cost effect of providing an integrated
system test function that could be used to give line replaceable unit (sensor, computer,
and servo) failure identification for line maintenance.
Simulation was used as a tool in the fault analysis as we began to evaluate the reconfiguration
processes. The simulation demonstrated that the ARCS fault-tolerant design was a valid
concept, even though the sensor signal selection/failure detection algorithms to deal with
the duplex-to-simplex operation were not fully developed.
A new computer-aided redundant system reliability analysis tool (CARSRA) was developed.
CARSRA advances the current technology in reliability assessment by modeling effects
such as failure "coverage," transient faults, and intermodule dependencies, and possesses
the capability to compute system functional readiness (availability) and failure probability.
"Coverage" is the probability that the system continues to operate given a failure.
CARSRA uses a unique modeling technique, which consists of partitioning the system into
stages where a stage is defined as a set of identical redundant modules. The possible functional
redundancy state of each stage is described by a Markov model, and failure dependencies
between stages are specified through a special entry to the CARSRA program. This
unique modeling technique makes it possible to describe nuisance failures and failure
coverage effects in the Markov model for each stage, thereby avoiding the problem of
having to deal with the exorbitant number of states that would result if the complete
system were to be modeled by a single Markov model.
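The per-stage Markov idea can be illustrated with a short numerical sketch. This is not the CARSRA program or its algorithm: the function names, the simple Euler integration, the treatment of coverage, and the assumption that stages fail independently are all illustrative choices, and the failure rate used in the example is hypothetical.

```python
def stage_failure_probability(n, lam, coverage, hours, steps=10_000):
    """P(stage has failed by `hours`) for a stage of n identical modules.

    States are the number of working modules (n down to 1) plus an
    absorbing failed state. With k > 1 working modules, a module failure
    (total rate k * lam) is survived with probability `coverage`;
    with k == 1 any module failure fails the stage.
    """
    dt = hours / steps
    p = [0.0] * (n + 1)      # p[k] = P(k modules working); p[0] = P(failed)
    p[n] = 1.0
    for _ in range(steps):
        nxt = p[:]
        for k in range(1, n + 1):
            out = k * lam * dt * p[k]      # probability flow leaving state k
            nxt[k] -= out
            if k > 1:
                nxt[k - 1] += coverage * out
                nxt[0] += (1.0 - coverage) * out
            else:
                nxt[0] += out
        p = nxt
    return p[0]


def system_failure_probability(stage_probs):
    """Combine stages, assuming (for this sketch only) independent stages."""
    survive = 1.0
    for q in stage_probs:
        survive *= 1.0 - q
    return 1.0 - survive


# Hypothetical example: triplex sensor, computer, and servo stages, 1-hour flight.
stages = [stage_failure_probability(3, 1e-3, 0.99, 1.0) for _ in range(3)]
print(system_failure_probability(stages))
```

Each stage here has only n + 1 states, whereas a single model of three triplex stages would need a state for every combination of stage states. That growth is the problem the stage partitioning described above is meant to avoid.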
The emphasis of the transient fault analysis effort was directed toward the prediction of
sensor nuisance failures since these have been a major problem in contemporary systems.
The analysis indicated that only a few particular sensors are potentially troublesome and that,
with careful design, nuisance failure indications could be controlled even for these sensors.
CARSRA was used to generate projected reliability data for an ARCS baseline and several
alternate configurations. It was shown that, assuming present-day failure rates, a quadruplex
ARCS-type system would meet the fly-by-wire reliability requirement, while a conventional
fail-op/fail-op/fail-passive system would fall short. An ARCS-implemented control/
stability augmentation function has a five times lower failure probability than does the
same function using the WWCS.
The probability of a system failure during a Cat III autoland, assuming that all system
modules are functional at the alert height, was 10 times lower for the ARCS than for the
WWCS. An advanced sensor system would result in an additional factor-of-[illegible] improvement
with the ARCS. Both the ARCS and WWCS failure probability projections surpass the
Cat III autoland requirements by a wide margin. However, the ARCS was shown to also
meet these requirements with one failure existing at the alert height.
Requiring that all system modules be functional at the Cat III alert height imposes a
significant penalty relative to the functional availability of autoland. By permitting an
autoland to be initiated with one module failed (sensor, computer, or servo), ARCS
achieves a 0.93 availability, assuming a 100-hour maintenance interval. Compared
with the conventional system (TMR) availability of 0.73, this means a factor of 4 reduction
in diversions due to Cat III autoland unavailability for the ARCS relative to the
WWCS.
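The factor of 4 follows from comparing unavailabilities rather than availabilities. The short check below assumes, as a simplification not stated in the report, that diversions caused by autoland unavailability are proportional to (1 - availability); the function name is illustrative.

```python
def diversion_reduction_factor(avail_new, avail_old):
    """Ratio of unavailabilities, taken here as the ratio of diversions."""
    return (1.0 - avail_old) / (1.0 - avail_new)


# ARCS availability 0.93 versus conventional TMR availability 0.73.
factor = diversion_reduction_factor(0.93, 0.73)
print(round(factor, 2))   # roughly 0.27 / 0.07
```

The ratio is just under 4, consistent with the rounded figure quoted in the text.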
A voting node placement trade study showed that a significant improvement (factor
of 10) in system functional reliability is achieved by providing a sensor signal voting
node to a "brickwall" servo output voted configuration. A negligible improvement is obtained,
however, when an additional voting node is introduced between the computers and servo
drive electronics.
A new approach was developed for measuring "coverage" (likelihood of survival given
a fault) that is based on a failure analysis of a randomly selected set of failure modes
extracted from the entire failure mode population. This approach, shown to be feasible
by an analysis/laboratory experiment, may provide a cost-effective method for demonstrating
a fault-tolerant system's potential functional success probability.
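The sampling idea can be sketched as follows. This is an illustration under stated assumptions, not the ARCS-developed procedure: `estimate_coverage`, the `survives` callback (standing in for a fault-insertion experiment or analysis of one failure mode), the normal-approximation confidence interval, and the example population are all hypothetical.

```python
import math
import random


def estimate_coverage(failure_modes, survives, sample_size, seed=0):
    """Estimate coverage from a random sample of failure modes.

    Returns (estimate, lower, upper), where lower/upper form an approximate
    95% interval based on the normal approximation to the binomial.
    """
    rng = random.Random(seed)
    sample = rng.sample(failure_modes, sample_size)
    survived = sum(1 for mode in sample if survives(mode))
    p = survived / sample_size
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / sample_size)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical population of 10,000 failure modes, of which 1 in 20 is not survived.
modes = list(range(10_000))
p, lower, upper = estimate_coverage(modes, lambda m: m % 20 != 0, 500)
print(p, lower, upper)
```

The interval width shrinks with the square root of the sample size, which is why a modest random sample can be far cheaper than analyzing the entire failure mode population.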
Three primary factors were considered in conducting the cost/benefit analyses: acquisition
cost, maintenance cost, and costs associated with schedule interruptions caused by not
having an operational Cat II/Cat III system. The cost data for the former analysis was
compiled by comparing the ARCS with the WWCS, which represented a contemporary
triple-modular-redundant (fail-op/fail-passive) system design. The data base for the latter
study was the ARCS hardware without the maintenance enhancement elements, maintained
in a conventional manner.
Using the defined airline model (150 aircraft) and a specific route structure typical
of United's 727 fleet, the ARCS-equipped airline can be expected to incur approximately
69 fewer diversions per year because of weather conditions requiring low-visibility landing
capability. Based on an average cost per diversion, this can save approximately $145 000
annually. Low-visibility operation is not standard procedure with all airlines today.
Category III autoland in particular is limited to a few airlines and airports worldwide.
Low-visibility capability is not a dispatch requirement but an airline option.
Including the system test maintenance feature in the ARCS can result in a potential
further annual saving of approximately $150 000 for the same airline model. This cost
saving is brought about largely by improving the effectiveness (success probability) of
the maintenance action, thereby reducing the number of unwarranted equipment removals
experienced with today's systems.
The cumulative effect of system acquisition cost and avoided diversion expense, for the
defined airline model, is an annual cost saving of approximately $495 000, more than
$3000 per aircraft. The inclusion of system test yields a further saving of approximately
$1000 per aircraft per year.
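The per-aircraft figures can be checked directly from the fleet totals quoted in this section. The variable names are illustrative, and the implied cost per diversion is a derived value, not a figure stated in the report.

```python
FLEET_SIZE = 150                 # aircraft in the defined airline model
ANNUAL_SAVING = 495_000          # acquisition cost plus avoided diversions, $/year
SYSTEM_TEST_SAVING = 150_000     # additional saving from system test, $/year
DIVERSION_SAVING = 145_000       # saving from avoided diversions, $/year
DIVERSIONS_AVOIDED = 69          # diversions avoided per year

per_aircraft = ANNUAL_SAVING / FLEET_SIZE
system_test_per_aircraft = SYSTEM_TEST_SAVING / FLEET_SIZE
implied_cost_per_diversion = DIVERSION_SAVING / DIVERSIONS_AVOIDED

print(per_aircraft, system_test_per_aircraft, round(implied_cost_per_diversion))
```

The first two results agree with the "more than $3000" and "approximately $1000" per-aircraft figures in the text.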
Other cost savings attributable to the ARCS can be expected as we look deeper into
the cost-of-ownership picture: reduced spares, maintenance-procedural improvements,
dispatch availability, etc.
8.2 CONCLUSIONS
The ARCS program assessed, in a broad sense, the relevance of fault-tolerant computer
technology to commercial jet transport avionic applications. The overall conclusion of this
assessment is that the economic impact of the ARCS technology is evolutionary rather
than revolutionary. The full benefit of fault-tolerant computer technology will be reaped
when the functions provided by the avionic systems become dispatch critical or otherwise
mandated for use on every flight. The greater functional survivability and availability
result from reconfiguration to simplex. For some users, simplex may not be an acceptable
mode of operation.
A significant accomplishment of the ARCS analysis task was the development of a reliability
analysis capability, including a computerized tool, that considerably improves the ability
to analyze redundant system architectures. The CARSRA program proved to be a powerful
tool essential for ARCS-type reliability assessments. The functional availability analysis
was an eye-opener in the sense that it showed that a TMR-type system will not achieve
postulated autoland functional availability goals.
Not all aspects of duplex-to-simplex reconfiguration were exhaustively covered by the ARCS
study; work remains in the areas of analytical redundancy (fault monitoring of simplex
signals) and techniques for predicting and evaluating "coverage," i.e., the likelihood
of function survival given a fault in the duplex state. Assessment of "coverage" is a central
problem in designing fault-tolerant systems. Credible reliability estimations hinge on the
level of confidence at which "coverage" values can be estimated for duplex-to-simplex
operations. Application of an ARCS-developed method, based on random-fault insertion,
to a gate level simulation/emulation of a candidate computer is seen as the next step
in fault-tolerant technology consolidation.
The ARCS data supports application of fault-tolerant computer technology to increase
the effectiveness of flight-critical avionics at lower cost to the operator. Fault-tolerant
computing will be a significant element in achieving fly-by-wire and active controls
technology, as well as general use of Cat III capability, at acceptable cost.
Boeing Commercial Airplane Company
P. O. Box 3707
Seattle, Washington 98124
August 4, 1976
APPENDIX A
ARCS APPLICATION MODEL CONTROL LAWS
This appendix describes control laws representative of an ARCS
application. Figure A1 is a reference overview of the individual pitch,
roll, yaw, flutter, and maneuver load control laws for the autoland,
go-around, and CCV/FBW functions, which are presented in Figures A2
through A10.
The exact complement of functions will depend on the specific vehicle
under consideration. The ARCS Near Term application model was
assumed to include the functions described by Figures A2 through A5,
plus A8; the Intermediate Term the functions in Figures A2 through A8;
and the Far Term model all of the described functions.
Sensor and mode control interface requirements, and servo and display
interface requirements for the near, intermediate, and far term application
models are shown in Figures A11 through A16.
FIGURE A5 GO-AROUND CONTROL LAW
[Block diagram. Recoverable labels: FLAP SIGNAL, ANGLE OF ATTACK, GROUND SPEED (VGS), PITCH ANGLE, ELEVATOR SERVO COMMAND.]
FIGURE A6 PITCH COMMAND AUGMENTATION CONTROL LAW (WITH Y HOLD)
[Block diagram. Recoverable labels: ROLL ANGLE, PITCH RATE, Q POT (q), ELEVATOR SERVO COMMAND.]
FIGURE A7 ROLL COMMAND AUGMENTATION CONTROL LAW
[Block diagram. Recoverable labels: ROLL RATE, ROLL ANGLE, REFERENCE TRACK ANGLE, TRACK HOLD, WHEEL FORCE, Q POT (q), AILERON SERVO COMMAND.]
FIGURE A8 YAW COMMAND AUGMENTATION CONTROL LAW
[Block diagram. Recoverable labels: Q POT (q), PEDAL FORCE (+ RUDDER TRIM), YAW RATE, RUDDER SERVO COMMAND.]
FIGURE A9 MANEUVER AND GUST LOAD ALLEVIATION CONTROL LAW
[Block diagram. Recoverable labels: ELEVATOR SERVO COMMAND, PITCH RATE, PITCH COMMAND, Q POT (q), ACCELERATION (az), L WINGTIP ACCELERATION, R WINGTIP ACCELERATION, L WING MIDSPAN ACCELERATION, R WING MIDSPAN ACCELERATION.]
FIGURE A10 FLUTTER SUPPRESSION CONTROL LAW
[Block diagram. Recoverable labels: PITCH RATE, R WINGTIP RATE, L WINGTIP RATE, R WINGTIP ACCELERATION, L WINGTIP ACCELERATION, SCHEDULED GAIN, OUTBOARD AILERON SERVO COMMAND, TIP SURFACE DRIVE SERVO COMMAND. Filter blocks: K, 1/(TS+1), TS/(TS+1).]
SIGNAL SOURCE                                   INPUT SIGNALS
MODE CONTROL
  PACKED SERIAL DATA                            1 SERIAL BUS
  5 MODE DISCRETES, 7 AIRCRAFT DISCRETES        12 X 28VDC DISCRETES
SYSTEM TEST
  WRAP AROUND                                   1 DC
CONTROL FORCE SENSORS
  CAPT PITCH & ROLL, F/O PITCH & ROLL           4 X 26VAC 2 WIRE
ATTITUDES AND ACCEL
  PITCH & ROLL ATT, HEADING, YAW RATE           3 X 26VAC 2 WIRE, 1 X 26VAC 3 WIRE
  LAT, NORM, LONG ACCEL                         3 X 15VDC
  5 VALIDS                                      5 X 28VDC DISCRETES
POSITION FEEDBACK SIGNALS
  SERVO POSITIONS, CONTROL SURFACE POS          9 X 26VAC 2 WIRE
ILS RECEIVERS
  LOC & G/S DEVIATIONS                          2 DC
  3 VALIDS                                      3 X 28VDC DISCRETES
RADIO ALTIMETER
  COARSE & FINE ALT                             2 DC
  VALID                                         1 X 28VDC DISCRETE
DIGITAL AIR DATA
  DYN PRESSURE, ALT RATE, ALTITUDE, AIRSPEED    1 X 575 SERIAL BUS
NAV-GUIDANCE COMPUTER
  VERT & LAT COMMANDS, RWY HDG, LOGIC DATA      1 SERIAL BUS
TOTALS
  21 28VDC DISCRETES
  16 26VAC 2 WIRE
  1 26VAC 3 WIRE
  8 DC ANALOG SIGNALS
  3 SERIAL BUSES
FIGURE A-11 SENSOR AND MODE CONTROL BLOCK -
NEAR TERM APPLICATION MODEL
OUTPUT                                          SIGNALS
MODE CONTROL/DISPLAY                            1 SERIAL BUS, 4 X 28V DISCRETES
  LAND ARM, ENG; CWS; AUTO-NAV;
  NONCRITICAL MODE DISPLAYS;
  APPROACH PROGRESS DISPLAY
ILS FLIGHT DIRECTOR
  PITCH COMMAND, ROLL COMMAND                   2 DC
  VALID                                         1 X 28V DISCRETE
RUDDER SERIES SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
RUDDER PARALLEL SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
PITCH SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
ROLL SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
AUTO THROTTLE SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
AUTO TRIM
  ARM, TRIM UP, TRIM DOWN                       3 28V DISCRETES
SYSTEM TEST COMMAND                             1 DC, 6 28V DISCRETES
  WRAP AROUND (2), ILS (2), RADIO ALTIMETER,
  DIGITAL AIR DATA SYSTEM, NAV-GUIDANCE COMPUTER
SUMMARY
  19 28V DISCRETES
  8 DC
  1 SERIAL BUS
FIGURE A-12 SERVO AND DISPLAY BLOCK -
NEAR TERM APPLICATION MODEL
SIGNAL SOURCE                                   INPUT SIGNALS
MODE CONTROL
  COCKPIT DISCRETES, AIRCRAFT DISCRETES         12 X 28VDC DISCRETES
CONTROL FORCE SENSORS
  CAPT P & R, F/O P & R                         4 X 26VAC 2 WIRE
POSITION
  ELEV, AIL, RUDDER,
  PITCH, ROLL, YAW, NOSE WHEEL                  9 X 26VAC 2 WIRE
ILS RECEIVERS
  LOC & G/S DEV                                 2 DC
  3 VALIDS                                      3 DISCRETES
RADIO ALTIMETERS
  C & F/O ALT                                   1 SERIAL BUS
ISADS
  ATTITUDES, RATES, ACCEL, TKA, h, etc.         1 SERIAL BUS
NAV-GUIDANCE COMP.                              1 SERIAL BUS
TOTALS
  15 28V DISCRETES
  13 26VAC 2 WIRE
  2 DC ANALOG
  3 SERIAL BUSES
FIGURE A-13 SENSOR AND MODE CONTROL BLOCK -
INTERMEDIATE TERM APPLICATION MODEL
OUTPUT                                          SIGNALS
MODE CONTROL/DISPLAY                            4 28V DISCRETES, 1 SERIAL BUS
  LAND ARM, ENG; CWS; AUTO
  CRT DISPLAYS: FLIGHT DIRECTOR COMMANDS,
  APPROACH PROGRESS
PITCH SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
ROLL SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
YAW SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
NOSE WHEEL SERVO
  SERVO COMMAND                                 1 DC
  SERVO ENGAGE                                  1 28V DISCRETE
AUTO TRIM
  ARM, TRIM UP, TRIM DOWN                       3 28V DISCRETES
SYSTEM TEST COMMAND                             1 DC, 6 28V DISCRETES
  WRAP AROUND (2), ILS (2), RADIO ALTIMETER,
  NAV GUIDANCE COMPUTER, ISADS
SUMMARY
  17 28V DISCRETES
  5 DC
  1 SERIAL
FIGURE A-14 SERVO AND DISPLAY BLOCK -
INTERMEDIATE TERM APPLICATION MODEL
SIGNAL SOURCE                                   INPUT SIGNALS
MODE CONTROL
  COCKPIT DISCRETES, AIRCRAFT DISCRETES         12 X 28VDC DISCRETES
CONTROL FORCE
  CAPT, F/O x (PITCH, ROLL, YAW, P/Y/R TRIM)    12 X 26VAC 2 WIRE
SERVO FEEDBACKS
  POSITION, DELTA P x (PITCH;                   26 X 26VAC 2 WIRE
  UPPER, LWR RUDDER; R, L AILERON;
  R, L MID SPOILER; R, L OTB SPOILER;
  R, L WINGTIP CONTROL; R, L SUPPRESSOR)
FLUTTER SUPPRESSION SENSORS
  2 RATE SENSORS (WINGTIP),
  2 ACCELEROMETERS (WINGTIP)                    4 X 26VAC 2 WIRE
MLS                                             1 SERIAL LINE
RADIO ALTIMETER                                 1 SERIAL LINE
ISADS                                           1 SERIAL LINE
NAV-GUIDANCE COMP.                              1 SERIAL LINE
TOTALS
  12 X 28VDC DISCRETES
  42 X 26VAC 2 WIRE
  4 SERIAL BUSES
FIGURE A-15 SENSOR AND MODE CONTROL BLOCK -
FAR TERM APPLICATION MODEL
OUTPUT                                          SIGNALS
MODE CONTROL/DISPLAY
  MODE INDICATION, MLS SELECT DISPLAY           1 SERIAL BUS
CRT DISPLAYS
  FLIGHT DIRECTOR, APPROACH PROGRESS            1 SERIAL BUS
PITCH CONTROL SURFACE
  COMMAND                                       1 DC
  ENGAGE                                        1 28V DISCRETE
UPPER RUDDER (SAME)                             1 DC, 1 28V DISCRETE
LOWER RUDDER (SAME)                             1 DC, 1 28V DISCRETE
LEFT AILERON, RIGHT AILERON (SAME)              2 DC, 2 28V DISCRETES
LEFT MIDSPAN SPOILER, RIGHT MIDSPAN SPOILER (SAME)
                                                2 DC, 2 28V DISCRETES
LEFT TIP SPOILER, RIGHT TIP SPOILER (SAME)      2 DC, 2 28V DISCRETES
LEFT WING TIP FLUTTER CONTROL,
RIGHT WING TIP FLUTTER CONTROL (SAME)           2 DC, 2 28V DISCRETES
LEFT OUTBOARD FLUTTER SUPPRESSOR,
RIGHT OUTBOARD FLUTTER SUPPRESSOR (SAME)        2 DC, 2 28V DISCRETES
SYSTEM TEST COMMAND                             1 DC, 5 DISCRETES
  WRAP AROUND (2), MLS, RADIO ALTIMETER,
  ISADS, NAV-GUIDANCE COMP
SUMMARY
  18 28V DISCRETES
  14 DC
  2 SERIAL BUSES
FIGURE A-16 SERVO AND DISPLAY BLOCK -
FAR TERM APPLICATION MODEL
APPENDIX B
SOFTWARE DESCRIPTION
B.1 ARCS SOFTWARE DESCRIPTION
This appendix describes the ARCS baseline software whose functional tree
is shown in Figures B1-A, B1-B, B1-C, and B1-D. The software structure
tree, shown in Figures B2-A, B2-B, B2-C, and B2-D, is derived from
the functional tree and gives a short mnemonic name to each function.
The executive function of the ARCS software is contained within the module-to-module
flow of control shown by the state transition diagrams of Figures
B3-A and B3-B. Process iteration is governed by the minor frame schedule
shown by the table in Figure B4, where a major frame is subdivided into two
minor frames of 20 ms for the near term and four minor frames of 10 ms
for the far term models. This major frame subdivision facilitates scheduling
processes requiring iterations of 50 times a second (command augmentation
functions) and 20 times a second (autoland path guidance control functions).
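The minor-frame dispatch idea can be sketched as a table lookup keyed by the position of the current minor frame within the major frame. This is an illustration only: the function name, the task names, and the slot assignments in `NEAR_TERM` are hypothetical and are not taken from Figure B4.

```python
def tasks_for_frame(frame_index, minor_frames_per_major, schedule):
    """Return the task names due in the given minor frame.

    `schedule` maps a task name to the set of minor-frame slots
    (0 .. minor_frames_per_major - 1) in which that task runs.
    """
    slot = frame_index % minor_frames_per_major
    return [task for task, slots in schedule.items() if slot in slots]


# Hypothetical near-term schedule: two minor frames per major frame.
NEAR_TERM = {
    "command_augmentation": {0, 1},    # runs in every minor frame
    "autoland_path_guidance": {0},     # runs once per major frame
}

for frame in range(4):
    print(frame, tasks_for_frame(frame, 2, NEAR_TERM))
```

Moving to the far term model would, in this sketch, change only the number of slots per major frame and the slot sets, not the dispatch logic.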
The software module descriptions that follow are presented using the format
introduced in the software design methodology. Subsection B.1.1 describes
those software modules associated with the high level functions of the ARCS
functional tree (reference Figure B1-A), i.e., real time, ground test, interrupt
processing, synchronous (foreground), and asynchronous (background) executive
relationships.
Subsection B.1.2 describes the heart of the ARCS real time foreground
functions, the redundancy management processes covered by Figure B1-D.
Subsection B.1.3 contains the software module descriptions of the application
control laws and associated mode control. Subsection B.1.4 contains the
software module descriptions of the background task. Subsection B.1.5 contains
the software module description of the ground test.
B.1.1 ARCS High Level Software Modules
To facilitate the use and maintenance of software developed according to the
ARCS software design format each new module description should start at
the top of a page. With two exceptions, this procedure has been followed
throughout the following module descriptions.
FIGURE B1-B GROUND TEST FUNCTIONAL REQUIREMENTS
[Functional tree. Recoverable labels: PREFLIGHT/MAINTENANCE TEST; INITIALIZE; TEST; DISPLAY; INTERRUPT PROCESSING.]
FIGURE B2-C ARCS SOFTWARE STRUCTURE
[Structure tree diagram.]
FIGURE B4 SCHEDULING TABLE
[Table marking with an X the frames in which each control law and sensor input is processed. Recoverable frame headings: 40 MS, 20 MS, 10 MS.
Control law rows: ROLL CAS, PITCH CAS, YAW CAS, ROLL A/L, PITCH A/L, YAW A/L, PITCH G/A.
Sensor rows: PITCH RATE (q), NORM ACC (Az), Q-POT, F-COL-C, F-COL-FO, HBARO, H, VGS, VPC, FLAP, GSE, RADALT, ROLL RATE, F-WHEEL-C, F-WHEEL-FO, TKA, HPC, F-PED-C, F-PED-FO, YAW RATE (R), LOC.
Rows marked ADDED FOR FAR TERM: control laws FLUTTER, MLA; sensors R WINGTIP RATE, L WINGTIP RATE, R WINGTIP ACC, L WINGTIP ACC, R WING MIDSPAN ACC, L WING MIDSPAN ACC.
A further annotation reads NEAR AND INTERMEDIATE TERM ONLY.]
The structured name for each module can be used to form an index to
the module descriptions. This index is given in Table B1.
TABLE B1: SOFTWARE INDEX
Structured Name
ARCS
ARCS.RELTIM
.INTPRO
.ITREST
.MEMPAR
.ARIOVR
.POWRON
.INOUT
.FORGND
.FORGND.REDMGT
.MODCON
.CONLAW
.REDMGT.XCHSYN
.XCHDAT
.RECOVR
.RECFIG
.RECFIG.SENSOR
.COMPUT
.XCHMON
.SERVO
.COMPUT.MONFUN
.MONTRP
.FALSTA
.BAKGND
.ONLINE
.MANDAT
.STPRO
ARCS.GNDTST
.PREMAN
.INITIAL
.TEST
.DISPLY
Module: ARCS; Advanced Reconfigurable Computer System Software
a) Description: The ARCS software is broken into real-time and non-real time operations.
The real-time operations are started upon the occurrence of a power-on interrupt.
Non-real time operation is started upon a request from the system test panel for
ground test and verification that the aircraft is on the ground. Control is passed
back to the real time operation when ground test is complete or the aircraft starts
moving. Recovery must be requested upon power-on or ground test complete.
b) Sub-Processes:
RELTIM — Real time operations.
GNDTST — Non-real time operations.
c) Inputs and Outputs of the sub-processes:
RELTIM
Inputs
GTRD
Outputs
GNDTST
Inputs Outputs
GDTST GTRD
d) Data - element definitions:
GDTST — Ground test request allowed.
GTRD — Ground test request discrete.
RECOV — Recovery required indicator.
0 — Recovery not needed
1 — Power on recovery
2 — Watchdog monitor recovery
3 — Recover left channel
4 — Recover right channel
RECOV
Module: ARCS; (cont'd)
e) Transition Diagram:
Power-On -> RELTIM
RELTIM -> GNDTST: GT/GTRD/GDTST
GNDTST -> RELTIM: not GT/GDTST/RECOV
f) Conditions
GT — The ground test condition.
True — ground test requested and the aircraft is stationary.
False — otherwise.
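The top-level transfer of control between real-time operation and ground test can be sketched as a two-state machine. This is a simplified illustration, not the ARCS executive: the `step` function and its arguments are hypothetical, and it assumes that recovery is requested on every return from ground test.

```python
def step(state, gt_requested, stationary, test_complete=False):
    """Return (next_state, recovery_requested) for one evaluation.

    The GT condition is true when ground test is requested and the
    aircraft is stationary.
    """
    gt = gt_requested and stationary
    if state == "RELTIM":
        return ("GNDTST", False) if gt else ("RELTIM", False)
    if state == "GNDTST":
        # Control returns when the test completes or the aircraft moves.
        if test_complete or not stationary:
            return ("RELTIM", True)
        return ("GNDTST", False)
    raise ValueError(f"unknown state: {state}")


state, recov = step("RELTIM", gt_requested=True, stationary=True)
print(state, recov)    # enters ground test
state, recov = step(state, gt_requested=True, stationary=False)
print(state, recov)    # aircraft moved: back to real time, recovery requested
```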
Module: ARCS. RELTIM; Real Time Operations
a) Description: The real time operation is composed of interrupt processing, foreground
tasks, and background tasks. Initial entry to this module is to the interrupt processing
task upon a power-on interrupt or to the foreground task where recovery will be
performed. When the foreground computation is complete, background is entered and
continues until an iteration reset causes a transfer to interrupt processing or
ground test is requested and allowed. If the foreground task is not completed before
the iteration reset interrupt occurs, then a computation not complete flag is set.
b) Sub-processes:
FORGND — Foreground tasks
BAKGND — Background tasks
INTPRO — Interrupt processing
c) Inputs and Outputs of the sub-processes:
FORGND
  Inputs: RECOV, WATMON
  Outputs: COMPLT
BAKGND
  Inputs: GTRD, INTRET
  Outputs: GDTST
INTPRO
  Inputs: INT
  Outputs: INTRET, RECOV
d) Data-element definitions:
COMPLT — Computation complete flag.
0 — Not complete
1 — Complete
GDTST — Ground test request allowed.
GTRD — Ground test request discrete.
INT — Interrupt.
INTRET — Interrupt return information.
RECOV — Recovery required flag.
WATMON — Watchdog monitor flag.
Module: ARCS. RELTIM; (cont'd)
e) Transition Diagram:
[State diagram linking INTPRO, FORGND, and BAKGND. Recoverable transition labels: /INT/ (due to interrupts); RECOV/; INTRET; /COMPLT/; COMPLT; GT/GTRD/GDTST.]
f) Conditions:
GT — Ground test.
Module: ARCS. RELTIM. INTPRO; Interrupt Processing
a) Description: The interrupt processing task consists of interrupt handling for
the iteration reset/recovery, memory parity, arithmetic overflow, and power-on
interrupts. The power-on and iteration reset/recovery interrupts cause a transfer
to the foreground task with the recovery request flag set to the appropriate value.
The memory parity and arithmetic overflow interrupts cause their occurrence to be
recorded and control is returned to the point that the interrupt occurred.
The information that one of these interrupts occurred is used to localize a fault
given that an output monitor tripped.
b) Sub-processes:
ITREST — Iteration reset/recovery
MEMPAR — Memory parity
ARIOVR — Arithmetic overflow
POWRON — Power-On
INOUT — Input/Output
c) Inputs and Outputs of the sub-processes:
ITREST
  Input: INT
  Output: INTRET, RECOV, WATMON
MEMPAR
  Input: INT
  Output: INTMP
ARIOVR
  Input: INT
  Output: INTAO
POWRON
  Input: INT
  Output: RECOV, WATMON, INTRET
INOUT
  Input: INT
  Output: To be defined.
d) Data-element definitions:
INT — Interrupt
INTAO — Record of frame that last arithmetic overflow interrupt occurred.
INTMP — Record of frame that last memory parity interrupt occurred.
INTRET — Interrupt return information
RECOV — Recovery request flag
WATMON — Watchdog monitor status flag.
Module: ARCS. RELTIM. INTPRO; (cont'd)
e) Transition Diagram:
INT -> POWRON: /INT/RECOV, WATMON
INT -> MEMPAR: /INT/INTMP
INT -> ITREST: /INT/INTRET, RECOV, WATMON
INT -> ARIOVR: /INT/INTAO
INT -> INOUT
Module: ARCS. RELTIM. INTPRO. ITREST; Iteration Reset
a) Description: The iteration reset interrupt initiates a new frame. There
are two cases that must be considered depending on the status of the watch-
dog monitor.
The first case is when the watchdog monitor is tripped. In this case the recovery
flag is set to zero to indicate normal operation if a power on recovery is shown.
Otherwise, the recovery flag is set to two to request watchdog monitor trip
recovery and the RAM is initialized.
The second case is when the watchdog monitor is not tripped. In this case the
recovery indicator is set to one to indicate that a latent failure of the watchdog
has been detected, if the recovery indicator is showing power-on recovery.
Otherwise, the recovery flag is set to zero to indicate normal operation.
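The two cases above reduce to a four-way decision on the watchdog state and the current recovery flag. The sketch below is an illustration of that logic only; the function name, the constant names, and the returned `initialize_ram` indicator are assumptions introduced for the example.

```python
RECOV_NONE = 0        # recovery not needed (normal operation)
RECOV_POWER_ON = 1    # power-on recovery
RECOV_WATCHDOG = 2    # watchdog monitor recovery


def iteration_reset(watchdog_tripped, recov):
    """Return (new_recov, initialize_ram) at the start of a new frame."""
    if watchdog_tripped:
        if recov == RECOV_POWER_ON:
            return RECOV_NONE, False       # first case, power-on recovery shown
        return RECOV_WATCHDOG, True        # request watchdog trip recovery
    if recov == RECOV_POWER_ON:
        return RECOV_POWER_ON, False       # latent watchdog failure detected
    return RECOV_NONE, False               # normal operation


print(iteration_reset(True, RECOV_NONE))
```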
Module: ARCS. RELTIM. INTPRO. MEMPAR; Memory Parity
a) Description: The memory parity interrupt indicates an error in the parity
of a computer word. When this interrupt occurs, the occurrence will be
recorded by the local channel and control returned to the point of interrupt.
Module: ARCS. RELTIM. INTPRO. ARIOVR; Arithmetic Overflow
a) Description: An arithmetic overflow will be handled by limiting the affected
register to the maximum allowable positive or negative value. The overflow
will be recorded in the local channel and control returned to the point at which
the process was interrupted.
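A minimal sketch of the limiting behavior, assuming a two's-complement register. The 16-bit default word length and both function names are assumptions made for illustration; the report does not state them here.

```python
def saturate(value: int, word_bits: int = 16) -> int:
    """Limit value to the signed range of a word_bits-wide register."""
    max_positive = (1 << (word_bits - 1)) - 1
    max_negative = -(1 << (word_bits - 1))
    return max(max_negative, min(max_positive, value))


def add_with_overflow_limit(a: int, b: int, word_bits: int = 16):
    """Return (limited sum, overflow occurred).

    The boolean stands in for the overflow record kept by the local channel.
    """
    raw = a + b
    limited = saturate(raw, word_bits)
    return limited, limited != raw
```

For example, `add_with_overflow_limit(30000, 10000)` limits the result to 32767 and reports that an overflow occurred.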
Module: ARCS. RELTIM. INTPRO. POWRON; Power On
a) Description: The power-on interrupt will force computation to begin with this
function. The computer's RAM will be initialized and a power-on recovery
will be requested by setting the recovery flag to one.
Module: ARCS. RELTIM. INTPRO. I/OINT; Input/Output Interrupt
a) Description: For purposes of the ARCS concept the following action will be
taken. The "stack empty" interrupt will cause the FIFO stack to be filled with
recovery data and the data to be transmitted. After the recovery data has been transmitted
the interrupt will be masked. For the near term model approximately six buffer
loads* of recovery data will need to be transmitted each frame. This will require
about 1 ms of CPU time per minor frame to load the buffer. For the complete
transfer of data to the foreign channels it will take 8 ms.
Module: ARCS. RELTIM. FORGND; Foreground (Synchronous) Tasks
a) Description: The foreground tasks are redundancy management, control laws,
and mode control, which are processed in that order. Redundancy management,
among other functions, produces inputs for the mode control and control law
processes. Mode control determines the flight mode based on pilot selection and
flight regime. The control law computes the servo commands and sets certain flight
regime indicators.
A determination is made in redundancy management whether to trip the watchdog
monitor. This determination is based on system status as recorded
in the system status table.
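The processing order can be sketched as one pass over a shared data space. The data-element names follow the input/output lists of this module; the function name, the dictionary representation, and the three callables are hypothetical stand-ins for the real sub-processes.

```python
def run_foreground_frame(redmgt, conlaw, modcon, state):
    """Run REDMGT, CONLAW, MODCON in that order for one frame (sketch only)."""
    # 1. Redundancy management: SENSOR, DISCRT, COMPLT -> CONIN, MODIN, TRPWD
    state["CONIN"], state["MODIN"], state["TRPWD"] = redmgt(
        state["SENSOR"], state["DISCRT"], state["COMPLT"])
    # 2. Control law: MODE, CONIN -> COMPLT, CONOUT, LAWDIS.
    #    Given the stated order, MODE here is the value left by the previous
    #    pass of mode control (an inference, not stated in the text).
    state["COMPLT"], state["CONOUT"], state["LAWDIS"] = conlaw(
        state["MODE"], state["CONIN"])
    # 3. Mode control: MODIN, LAWDIS -> MODE
    state["MODE"] = modcon(state["MODIN"], state["LAWDIS"])
    return state
```

Passing the sub-processes in as callables keeps the ordering logic testable without any real control law or sensor code.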
b) Sub-processes:
REDMGT — Redundancy management
CONLAW — Control law
MODCON — Mode control
c) Inputs and Outputs of the sub-processes:
REDMGT
Inputs
SENSOR
DISCRT
COMPLT
Outputs
CONIN
MODIN
TRPWD
CONLAW
Inputs
MODE
CONIN
Outputs
COMPLT
CONOUT
LAWDIS
MODCON
Inputs
MODIN
LAWDIS
Outputs
MODE
d) Data-element definitions:
COMPLT — Computation Complete
CONIN — Control law inputs
CONOUT — Control law outputs
MODIN — Mode Control Inputs
LAWDIS — Control law computed flags
MODE — Selected flight mode
SENSOR — Sensor values
Module: ARCS. RELTIM. FORGND; (cont'd)
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible node: MODCON. Legible edge labels: /CONIN, MODIN, COMPLT; /MODE, CONIN/CONOUT, COMPLT; /MODIN, LAWDIS/MODE.]
f) Conditions:
— Watchdog monitor tripped.
B. 1.2 ARCS Redundancy Management Software Modules
An important aspect of the executive function associated with the Redun-
dancy Management operation is that tied to the reconfiguration process.
The reconfiguration process encompasses those functions required to assess
the system fault status. Figure B5 is a breakdown of the system status
indicators. This breakdown shows the hierarchical data structure of the
system status table which drives the reconfiguration process. Based on
the information contained in this table, the local computer can determine the
system redundancy level. The recognized system redundancy level, based
on synchronization data, will in turn set portions of the system status table
to effect redundancy management processing which is consistent with the
redundancy level up/down transitions.
The following is a module-by-module description of the Redundancy Manage-
ment software.
[Figure B5 diagram not reproduced. Legible node labels: System Status; Watch-Dog Monitor; Sync; Foreign Sync Commands Monitor; Sensors (SSFD); Computers; Output Monitor; Self-Test; Processor Faults; Memory Faults; Fault Status; I/O; X-Channel Integrity; Servos; Local Hardware; Sum-Currents Monitor; Memory Parity; Arithmetic Overflow; Engage Status Monitor.]
FIGURE B5 TREE BREAKDOWN OF SYSTEM STATUS
Module: ARCS. RELTIM. FORGND. REDMGT; Redundancy Management
a) Description: Redundancy management manages processes that are concerned
with multiple copies of the same information or the alignment in time of redun-
dant processes. Specifically, redundancy management consists of reconfiguration,
cross channel synchronization, recovery and cross channel data transfer.
The synchronization process produces synchronization status, channel identity, and
software minor frame count. This information is used by the reconfiguration process
in performing its monitoring function. The software frame count is used by the cross
channel transfer process in selecting the information to be transmitted.
The reconfiguration process may perform recovery based on its monitoring. In this
event the recovery process would use the cross channel data that was received from
one of the foreign channels.
b) Sub-processes:
XCHSYN — Cross channel synchronization
XCHDAT — Cross channel data transfer
RECFIG — Reconfiguration
RECOVR — Recovery processes
c) Inputs and Outputs of the Sub-processes:
XCHSYN
Inputs
RECOV
WATMON
COMPLT
Outputs
FRAME
LOCHAN
SYNC
COMPLT
TRPWD
RECOVR
Inputs
XCHDT
SYNC
RECOV
Outputs
SVEC
FRAME
XCHDAT
Inputs
FRAME
SVEC
Outputs
To other channels
CONOUT
MODE
RECFIG
Inputs
FRAME
LOCHAN
SYNC
CONOUT
Outputs
CONIN
DISCRT
RECDAT
d) Data-Element Definitions:
COMPLT — Computation Complete
RECOV — Recovery request flag
WATMON — Watchdog monitor status flag
FRAME — Local minor frame count
LOCHAN — Local channel identity
SYNC — Synchronization status word
CONOUT — Control law output
CONIN — Control law input
DISCRT — Discretes
RECDAT — Recovery request for data
XCHDT — Cross channel data for foreign channels
SVEC — State vector
TRPWD — Trip watchdog monitor
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible nodes: XCHSYN, XCHDAT, RECOVR. Legible edge label: -TW/TRPWD.]
f) Conditions:
TW — Trip watchdog monitor
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN; Cross Channel Synchronization
a) Description: The cross channel synchronization process consists of a recovery
synchronization process, a channel identity process, and a normal synchronization
process. First, channel identity is determined. Then, if the watchdog monitor is
tripped or recovery is requested, recovery synchronization is performed. If neither
of those conditions is true, normal synchronization is performed, unless the previous
frame's computation was not completed. When computation is not completed, the watchdog
monitor will be tripped by not setting and resetting the local sync commands.
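The selection among the three outcomes can be sketched as follows. The function name and the returned labels are hypothetical; "recovery requested" is taken to mean RECOV equal to 1 or 2, matching conditions R1 and R2 listed for this module. Channel identification is assumed to have run already.

```python
def select_sync_action(watmon_tripped: bool, recov: int, complt: bool) -> str:
    """Pick the synchronization path for this frame (illustrative sketch)."""
    if watmon_tripped or recov in (1, 2):
        return "RECSYN"          # recovery synchronization
    if not complt:
        # Previous frame did not finish: the local sync commands are not set
        # and reset, which trips the watchdog monitor.
        return "TRIP_WATCHDOG"
    return "NORSYN"              # normal synchronization
```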
b) Sub-processes:
CHANID — Channel identity word
RECSYN — Recovery synchronization
NORSYN — Normal synchronization
c) Inputs and Outputs of the Sub-processes:
CHANID RECSYN NORSYN
Inputs
COMPLT
LA
LB
WATMON
RECOV
Outputs
LOCHAN
TRPWD
Inputs
LA
LB
Outputs
SYNC
COMPLT
Inputs
LA
LB
Outputs
SYNC
COMPLT
d) Data-Element Definitions:
COMPLT — Computation complete
LA and LB — These indicators identify the local channel.
LOCHAN — Identity of the local channel
FRAME — Local minor frame count
RECOV — Recovery request flag
SYNC — Synchronization status word
TRPWD — Trip watchdog monitor
WATMON — Watchdog monitor status flag
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible edge labels: -CN/COMPLT; R2/RECOV/; R1/RECOV/; -R1, -R2/RECOV/.]
f) Conditions:
CN — Computation complete
R2 — Recovery required (RECOV is 2)
R1 — Recovery required (RECOV is 1)
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. CHANID; Channel Identification
a) Description: The channel identification process identifies the local, left, and
right channels. This is accomplished by checking the indicators LA and LB
shown in Table B2 and using this identity in Table B3 to establish local, left and
right.
LA   LB   Channel
0    0    A
0    1    —
1    0    B
1    1    C
Table B2: CHANNEL IDENTIFICATION
If Local is   Then Left is   Then Right is
A             C              B
B             A              C
C             B              A
Table B3: LEFT AND RIGHT CHANNEL IDENTIFICATION
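Tables B2 and B3 amount to two small lookups, sketched below. The function and dictionary names are hypothetical, and raising an error for the unassigned indicator pair (LA = 0, LB = 1) is an assumption; the report does not say how that pair is handled.

```python
# Table B2: (LA, LB) -> local channel. The pair (0, 1) has no channel assigned.
CHANNEL_BY_INDICATORS = {(0, 0): "A", (1, 0): "B", (1, 1): "C"}

# Table B3: local channel -> (left channel, right channel).
NEIGHBORS = {"A": ("C", "B"), "B": ("A", "C"), "C": ("B", "A")}


def identify_channels(la: int, lb: int):
    """Return (local, left, right) channel letters for the indicators LA, LB."""
    local = CHANNEL_BY_INDICATORS.get((la, lb))
    if local is None:
        # Assumed handling: treat the unassigned pair as invalid.
        raise ValueError(f"no channel assigned to LA={la}, LB={lb}")
    left, right = NEIGHBORS[local]
    return local, left, right
```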
b) Subprocesses:
CHANA — Local channel is A
CHANB — Local channel is B
CHANC — Local channel is C
c) Inputs and Outputs of the sub-processes:
CHANA
Inputs
LA
LB
Outputs
LOCHAN
LCHAN
RCHAN
CHANB
Inputs
LA
LB
Outputs
LOCHAN
LCHAN
RCHAN
d) Data-Element Definitions:
LA -
LB -
LOCHAN — Local channel
LCHAN — Left channel
RCHAN — Right channel
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible nodes: CHANA, CHANB, CHANC. Legible edge labels: /LA,LB/LOCHAN; /LA/LOCHAN.]
f) Conditions:
C1 — LA is zero
C2 — LA is one and LB is zero
C3 — LA is one and LB is one
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. CHANID. CHANA;
Local Channel is A
a) Description: The variable LOCHAN is set to zero, LCHAN is set to two,
and RCHAN is set to one.
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. CHANID. CHANB;
Local Channel is B
a) Description: The variable LOCHAN is set to one, LCHAN is set to zero, and
RCHAN is set to two.
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. CHANID. CHANC;
Local Channel is C
a) Description: The variable LOCHAN is set to two, LCHAN is set to one,
and RCHAN is set to zero.
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. RECSYN;
Recovery Synchronization
a) Description: The initial synchronization algorithm was described in Section 2.1.1.
Module: ARCS. RELTIM. FORGND. REDMGT. XCHSYN. NORSYN;
Normal Synchronization
a) Description: The normal synchronization algorithm was described in Section 2.1.1.
A recovery indicator will be set by this process to indicate when the local computer
first synchronizes with another computer. This indicator will be used to release
permanent failure flags for the recovering computer.
Module: ARCS. RELTIM. FORGND. REDMGT. XCHDAT; Cross Channel Transmission
a) Description: This process will initiate cross channel transmission by unmasking
the buffer empty interrupt and transmitting that information that is needed for
this frame's computation and redundancy management.
b) Sub-processes:
W — Wait for transmission complete
UM — Unmask interrupt
SWT — Software frame to FIFO
SNT — Sensor Data to FIFO
SRT — Servo Data to FIFO
c) Inputs and outputs of the sub-processes:
UM
None
SWT
Inputs
FC
Outputs
To FIFO
SNT
Inputs
Sensor input for frame FC
Outputs
To FIFO
SRT
Inputs
Servo commands for frame FC
Outputs
To FIFO
d) Data-Element Definitions
FC — Frame Count
FIFO — Buffer for cross channel transmission.
e) The Transition Diagram of
Module: ARCS. RELTIM. FORGND. REDMGT. RECOVR; Recovery
a) Description: For a working computer, the recovery process consists of resetting
permanent failure flags when it synchronizes with another computer and then
proceeding with its normal computations. Recovery for the faulted computer consists
of updating its flags and history to those of a working computer and then proceeding
with normal computation.
b) Sub-processes:
UPDATE — Update flags and history
RESETL — Reset failure flags for the left computer.
RESETR — Reset failure flags for the right computer.
c) Inputs and Outputs of the sub-processes:
UPDATE
Inputs
LSYNC
RSYNC
LRCVR
RRCVR
Outputs
FRAME
SYSTAB
HISTORY
RESETL
Inputs
LSYNC
Outputs
LFFLAG
RESETR
Inputs
RSYNC
Outputs
RFFLAG
d) Data-Elements:
LSYNC — Left sync flag
RSYNC - Right sync flag
LRCVR — Left receiver
RRCVR — Right receiver
FRAME — Local frame count
HISTORY — Local history
LFFLAG - Left failure flags
RFFLAG - Right failure flags
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible edge labels: NR/RECOV/; RR/RECOV/; RL/RECOV/; UP/RECOV/.]
f) Conditions:
NR — No recovery (RECOV is 0)
UP — Update local computer (RECOV is 1 or 2)
RL — Recover the left computer (RECOV is 3)
RR — Recover the right computer (RECOV is 4)
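The four conditions above reduce to a dispatch on the value of RECOV. In this sketch the function name is hypothetical, and rejecting values outside 0 to 4 is an assumption; the sub-process names are those listed for this module.

```python
def select_recovery_process(recov: int):
    """Map the recovery flag RECOV to the sub-process to run (sketch only).

    Returns None when no recovery is required (condition NR).
    """
    if recov == 0:
        return None        # NR: no recovery
    if recov in (1, 2):
        return "UPDATE"    # UP: update local computer
    if recov == 3:
        return "RESETL"    # RL: recover the left computer
    if recov == 4:
        return "RESETR"    # RR: recover the right computer
    # Assumed handling: values outside 0..4 are not defined in the text.
    raise ValueError(f"undefined RECOV value: {recov}")
```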
Module: ARCS. RELTIM. FORGND. REDMGT. RECOVR. UPDATE: Update Local
Information
a) Description: Based on the synchronization status, the local computer will
update its local information to either the left or the right channel. If the local
computer is not synchronized with another computer, it will start processing
and build up its own history over a period of frames.
The flow chart of Figure B6 shows the flow of control for this process.
Each block of recovery data that is received by the recovering computer will
have a code word at the end of it. This word will indicate that new data has
been received. The word will be checked until it indicates new data received,
then this new data will be used to update the recovering computer's history.
A timing loop could be used here to assure that a cross-channel data fault does
not prevent the recovering computer from getting into operation.
The strategy would be to try the left and, if that does not work, try the right. If
neither the left nor the right is sending data, then, in the absence of other local
faults, begin simplex operation.
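A minimal sketch of the left-then-right strategy with a bounded polling loop standing in for the timing loop. Everything named here is hypothetical: the function, the `new_data_ready` callback (which models checking the code word at the end of a recovery data block), and the `max_polls` bound.

```python
from typing import Callable, Optional


def choose_update_source(left_synced: bool,
                         right_synced: bool,
                         new_data_ready: Callable[[str], bool],
                         max_polls: int = 100,
                         local_fault: bool = False) -> Optional[str]:
    """Return 'left', 'right', 'simplex', or None if no option is available."""
    for side, synced in (("left", left_synced), ("right", right_synced)):
        if not synced:
            continue
        # Bounded wait so a cross-channel data fault cannot stall recovery.
        for _ in range(max_polls):
            if new_data_ready(side):
                return side
    # Neither side delivered data: go simplex only without other local faults.
    return None if local_fault else "simplex"
```

For example, if both channels are synchronized but only the right one delivers a new code word, the function returns `'right'` after exhausting the polls on the left.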
Module: ARCS. RELTIM. FORGND. REDMGT. RECOVR. RESETL;
Reset Left
a) Description: The permanent failure flags for the left computer are reset.
The function is performed when the local working computer sees a previously
faulted computer come into synchronization.
Module: ARCS. RELTIM. FORGND. REDMGT. RECOVR. RESETR;
Reset Right
a) Description: The permanent failure flags for the right computer are reset.
Again, this is performed by a local working computer when a previously failed
computer comes into synchronization.
[Figure B6 flowchart not reproduced. Legible steps: Is local computer synchronized with the left? Update local to left. Is local computer synchronized with right? Update local to right. End update.]
FIGURE B6 UPDATE FUNCTION
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG; Reconfiguration
a) Description: The reconfiguration process is concerned with sensors, computers,
cross channel data links, and servos. The process is based on the monitors for
these four pieces of hardware and the results of the synchronization process. The
synchronization information is used to establish the overall redundancy level as
seen by the local channel. Monitoring of the various hardware components is used
to obtain the redundancy level of those components. This is presented in detail
when the design for the individual components is discussed.
The cross channel is monitored first to establish the validity of the cross channel
information. Second, the computer monitoring is performed to assure software
frame synchronization and the servo commands that were computed last frame. Third,
the servo process is performed. Finally, the sensor selection failure detection is
performed which starts a new set of computations that will be monitored next frame.
b) Sub-processes:
SENSOR — Sensor select failure detect.
COMPUT — Computer monitoring
XCHMON — Cross channel monitor
SERVO — Servo monitor and control
c) Inputs and Outputs of the Sub-processes:
SENSOR
Inputs
from the sensors
XCHAN
Outputs
CONIN
DISCRT
XCHMON
Inputs
XCHAN
LOCHAN
Outputs
XCHVAL
COMPUT
Inputs
CONOUT
XCHAN
FRAME
Outputs
SERCON
SERVO
Inputs
SERCON
CONOUT
Outputs
To the servos
d) Data-Element Definitions:
XCHAN — Cross channel information that was received.
CONIN — Control law inputs
MODIN — Mode Control Inputs
FALSTA — Local fault status
CONOUT — Control law outputs
SERCON — Servo control flag consisting of monitors and permanent flags.
XCHVAL — Cross channel valid word.
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible fragment: "//CONIN, MODI".]
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. SENSOR; Sensors
a) Description: The basic function of this module is to provide the control law
process with necessary inputs based on sensor values from all operating channels.
In a given minor frame a number of sensor selection failure detection (SSFD)
processes will operate based on the minor frame count, as shown in the following
paragraph.
The particular sensor sets that must be processed in a given frame can be derived
from the scheduling table of Figure 2-20.
b) Sub-processes:
SSF0 — Sensor Select Failure Detect for frame 0 [Near Term]
SSF1 — Sensor Select Failure Detect for frame 1 [Near Term]
SSF2 — Sensor Select Failure Detect for frame 2
SSF3 — Sensor Select Failure Detect for frame 3
[Far Term]
c) Inputs and Outputs to the Sub-processes:
SSFi
Inputs
SENSOR
SYSTAB
Outputs
CONIN
SYSTAB
d) Data-Element Definitions:
SENSOR — Sensor values
SYSTAB — System status table. The synchronization information will be used
as an input and the sensor information will be updated.
CONIN — Control law inputs.
e) Transition Diagram:
N is 1 for the near term
N is 2 for the far term
f) Conditions:
F0 - FRAME is 0.
F1 - FRAME is 1.
FN - FRAME is N.
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. SENSOR. SSFD;
Sensor Select Failure Detection
a) Description: The SSFD function is used for each sensor set.
For a typical SSFD the inputs are three sensor values A, B, and C, and the output
is the selected sensor value, CI. For recovery purposes a number of internal
variables must be available to the recovery process.
Figure B7 shows the functional breakdown of the SSFD function and the
block diagram for the SSFD. This figure will be used in the development of the
data-space and control flow.
b) Sub-processes:
AVER — Average of 3 or 2.
BIAS — Bias error calculation.
BECOMP — Bias error compensation
RECSEN — Reconfiguration based on fault detection.
c) Inputs and Outputs of the sub-processes:
AVER
Inputs
IUA
IUB
IUC
APRIME
BPRIME
CPRIME
Outputs
OUTAVE
BECOMP
Inputs
A
B
C
Outputs
APRIME
BPRIME
CPRIME
BIAS
Inputs
OUTAVE
A
B
C
Outputs
BERA
BERB
BERC
RECSEN
Inputs
BERA
BERB
BERC
APRIME
BPRIME
CPRIME
OUTAVE
A
B
C
Outputs
IUA
IUB
IUC
APRMF
BPRMF
CPRMF
[Figure B7 block diagram not reproduced. Processing sequence: bias error compensation, fault detection, average calculation, bias error calculation. Legible blocks: A, B, and C sensor raw data; bias error compensation; static fault detector; dynamic fault detector; fault duration timer and permanent fault latch; duplex to simplex logic; reconfiguration algorithm; average of three/two; simplex algorithm; average calculation; bias error calculation; output. Legend: one frame delay.]
FIGURE B7. CONTINUOUS SIGNAL SELECTION/FAULT DETECTION ALGORITHM
d) Data-Element Definitions:
IUi — i = A, B, C. Sensor i "use flag". A value of 1 means use the sensor in the process. IUi is from SYSTAB.
A, B, and C — The three raw sensor values.
OUTAVE — The average of A, B, and C as determined by IUi.
BERi — i = A, B, C. Bias error for the sensor i.
iPRIME — i = A, B, C. Compensated sensor value.
iPRMF — i = A, B, C. Permanent failure flag which is in SYSTAB.
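The AVER sub-process ("average of 3 or 2") can be sketched from the definitions of OUTAVE and the use flags. The function name is hypothetical, and the equal-weight mean, the pass-through of a single remaining sensor, and the error when no sensor is usable are assumptions; bias compensation and fault detection are not modeled.

```python
def average_of_used(values, use_flags):
    """Average the compensated sensor values whose use flag is 1.

    values:    (APRIME, BPRIME, CPRIME)
    use_flags: (IUA, IUB, IUC), where 1 means "use this sensor"
    """
    used = [v for v, use in zip(values, use_flags) if use == 1]
    if not used:
        # Assumed handling: the text does not cover the no-sensor case.
        raise ValueError("no sensor is flagged for use")
    return sum(used) / len(used)
```

With all three flags set the result is the three-sensor average; clearing one flag gives the two-sensor average, and a single flag passes that sensor's value through.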
e) Transition Diagram:
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. COMPUT; Computer
Monitoring and Failure Assessment
a) Description: The functions that must be accomplished by this process are output
monitoring, monitor trip assessment, and failure status assessment.
b) Sub-processes:
MONFUN — Monitor Function
MONTRP — Monitor Trip Assessment
FALSTA — Failure Status Assessment
c) Inputs and Outputs of the sub-processes:
MONFUN
Inputs
CONOUT
XCHAN
Outputs
MONITR
MONTRP
Inputs
MONITR
Outputs
COUNTS
FALSTA
Inputs
COUNTS
MONITR
Outputs
SERCON
d) Data-Element Definitions:
CONOUT — Control law outputs
XCHAN — Cross Channel information that was received.
MONITR — Vector of the fifteen monitor flags given in Table B5.
COUNTS — Vector of the fifteen monitor trip counters defined in Table B6.
SERCON — Servo Control vector consisting of thirteen servo monitor flags
and thirteen permanent isolation flags.
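e) Transition Diagram: [Diagram not recoverable from the source scan. Legible nodes: MONFUN, MONTRP. Legible edge labels: /CONOUT, XCHAN/MONITR; /MONITR/COUNTS; /COUNTS, MONITR/SERCON.]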
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. COMPUT. MONFUN;
Output Monitor Function
a) Description: The computer monitoring process assesses the validity of the
local computer's computations based on foreign computer computations. The
monitoring process only makes sense for triplex and duplex operation. The
primary level of redundancy is determined by the channels' synchronization
status and the secondary level is determined by the permanent fault flags and
monitor flags.
In Table B4 the eight possible cases for the three flags are shown and the good outputs
are indicated.
        Monitor Flags
Case    A   B   C    Meaning
0       0   0   0    A, B, and C agree
1       0   0   1    A and B agree
2       0   1   0    A and C agree
3       0   1   1    Cannot occur
4       1   0   0    B and C agree
5       1   0   1    Cannot occur
6       1   1   0    Cannot occur
7       1   1   1    A, B, and C disagree
Table B4: Monitor Flag Meaning
The monitor flags for outputs A, B, and C are set to indicate which outputs are in
disagreement. The comparison shown in Figure B8 can be for equality, that the values
are within some tolerance of each other, or that the signals are within tolerance of an
average of all valid signals.
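The flag setting for triplex operation can be sketched as follows, using the "within some tolerance of each other" comparison. The function name and the `tolerance` parameter are hypothetical. All three flags start set and each agreeing pair clears its two flags, which reproduces the cases of Table B4.

```python
def triplex_monitor_flags(a: float, b: float, c: float, tolerance: float = 0.0):
    """Return monitor flags (A, B, C); 1 means that output disagrees.

    a = local output, b = left output, c = right output.
    """
    def agree(x: float, y: float) -> bool:
        return abs(x - y) <= tolerance

    flags = {"A": 1, "B": 1, "C": 1}   # start with every monitor flag set
    if agree(a, b):
        flags["A"] = flags["B"] = 0
    if agree(a, c):
        flags["A"] = flags["C"] = 0
    if agree(b, c):
        flags["B"] = flags["C"] = 0
    return flags["A"], flags["B"], flags["C"]
```

Because flags are cleared in pairs, the result has zero, one, or three flags set, so the combinations marked as unable to occur in Table B4 are never produced.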
[Figure B8 flowchart not reproduced. Legible steps: Monitor Function; set monitor flags (A = local, B = left, C = right); does A compare to B? (reset A & B monitor flags); does A compare to C? (reset A & C monitor flags); does B compare to C? (reset B & C monitor flags); reset A, B & C monitor flags; monitor complete.]
FIGURE B8 TRIPLEX MONITORING
The output monitors are listed in Table B5 with their associated monitor flags and the
allowable tolerance between values.
Monitor    Tolerance   Description
MPCS                   Pitch Control Surface
MUR                    Upper Rudder
MLR                    Lower Rudder
MLA                    Left Aileron
MRA                    Right Aileron
MLMS                   Left Midspan Spoiler
MRMS                   Right Midspan Spoiler
MLTS                   Left Tip Spoiler
MRTS                   Right Tip Spoiler
MLWTFC                 Left Wing Tip Flutter Control
MRWTFC                 Right Wing Tip Flutter Control
MLOFS                  Left Outboard Flutter Suppressor
MROFS                  Right Outboard Flutter Suppressor
MFRAME     Equality    Minor Frame Count
MMODE      Equality    Flight Mode
Table B5: Output Monitors
The assessment of output disagreement in duplex can not be done by monitoring alone. The
monitor function for duplex is shown in figure B9. As seen from the figure either
the output monitor is tripped or not tripped.
[Figure B9 flowchart not reproduced. Legible steps: Monitor Function; a yes/no comparison; reset monitor flag; set monitor flag; good.]
FIGURE B9 DUPLEX MONITOR
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. COMPUT. MONTRP;
Monitor Trip Assessment
a) Description: The monitor trip assessment consists of counting the number of
monitor trips. In general, if a monitor is tripped the counter for that monitor
is incremented. The counters are given in Table B6 with the threshold at which
a permanent fault will be declared.
Counter     Threshold   Monitor Tripped
CMPCS                   Pitch Control Surface
CMUR                    Upper Rudder
CMLR                    Lower Rudder
CMLA                    Left Aileron
CMRA                    Right Aileron
CMLMS                   Left Midspan Spoiler
CMRMS                   Right Midspan Spoiler
CMLTS                   Left Tip Spoiler
CMRTS                   Right Tip Spoiler
CMLWTFC                 Left Wing Tip Flutter Control
CMRWTFC                 Right Wing Tip Flutter Control
CMLOFS                  Left Outboard Flutter Suppressor
CMROFS                  Right Outboard Flutter Suppressor
CMFRAME                 Minor Frame Count
CMMODE                  Flight Mode
Far Term
* The counters are always greater than or equal to zero.
Table B6: Monitor Trip Counters
Figure B10 shows the operations that must be accomplished to perform the
monitor trip assessment process in triplex. The duplex process is shown
in Figure B11.
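The per-monitor counting can be sketched as below. The function name is hypothetical; the sketch covers only the increment on a trip, the decrement otherwise, and the floor at zero noted with Table B6. The local-fault checks that appear in the flowcharts are not modeled.

```python
def update_trip_counter(count: int, tripped: bool) -> int:
    """Increment the counter on a monitor trip, otherwise decrement it.

    The counter never goes below zero.
    """
    if tripped:
        return count + 1
    return max(0, count - 1)
```

Counting up on trips and down on good frames means that only a sustained disagreement can drive a counter past its threshold.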
[Figure B10 flowchart not reproduced. Legible steps: Monitor Trip; are all monitors tripped?; is there an associated local fault?; set local monitor to good; is A monitor tripped? (decrement A count / increment A count); is B monitor tripped? (decrement B count); is C monitor tripped? (decrement C count / increment C count).]
FIGURE B10. TRIPLEX MONITOR TRIP ASSESSMENT
[Figure B11 flowchart not reproduced. Legible steps: Monitor Trip; is there an associated local fault?; increment local monitor trip count; decrement local count.]
FIGURE B11 DUPLEX MONITOR TRIP ASSESSMENT
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. COMPUT. FALSTA;
Failure Status Assessment
a) Description: The failure status assessment consists of comparing the monitor
trip counts to the count thresholds given in Table B6. When the computed
servo commands set a local permanent failure, the affected servo is disengaged and
that monitor is degraded to the next lowest redundancy level.
In the case of a foreign servo output monitor trip, the associated monitor will be
degraded. For a local minor frame or flight mode permanent fault, the watchdog
monitor will be tripped after the permanent fault is declared.
Figure B12 shows the operation of the failure status assessment. The permanent
flags that are set are defined in Table B7.
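The threshold comparison can be sketched as a mapping from trip counters to permanent fault flags. The function name is hypothetical and the threshold values in the usage note are invented for illustration, since none survive in Table B6. Flag names are derived from counter names by replacing the leading C with P, following the naming in Tables B6 and B7.

```python
def assess_failure_status(counts: dict, thresholds: dict) -> dict:
    """Return permanent fault flags keyed by flag name (sketch only).

    A flag is set when its counter is greater than its threshold.
    """
    flags = {}
    for counter_name, count in counts.items():
        flag_name = "P" + counter_name[1:]   # e.g. CMPCS -> PMPCS
        flags[flag_name] = count > thresholds[counter_name]
    return flags
```

With hypothetical thresholds of 3, counters `{"CMPCS": 4, "CMUR": 1}` yield `{"PMPCS": True, "PMUR": False}`.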
[Figure B12 flowchart not reproduced. Legible steps: is the counter greater than the threshold? (yes / no); set permanent fault flag.]
FIGURE B12 GENERAL FAILURE STATUS ASSESSMENT
Permanent
Fault Flags   Failure Declared
PMPCS         Pitch Control Surface
PMUR          Upper Rudder
PMLR          Lower Rudder
PMLA          Left Aileron
PMRA          Right Aileron
PMLMS         Left Midspan Spoiler
PMRMS         Right Midspan Spoiler
PMLTS         Left Tip Spoiler
PMRTS         Right Tip Spoiler
PMLWTFC       Left Wing Tip Flutter Control
PMRWTFC       Right Wing Tip Flutter Control
PMLOFS        Left Outboard Flutter Suppressor
PMROFS        Right Outboard Flutter Suppressor
PMFRAME       Minor Frame Count
PMMODE        Flight Mode
Far Term
Table B7: Permanent Failure Flags
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. XCHMON;
Cross-Channel Monitor
a) Description: The cross-channel monitor process consists of comparing the
local channel integrity word to those received from the foreign channels. In
the event that the system is simplex the cross-channel monitor does nothing.
The output of this process is a word that indicates the validity of the cross-
channel data in the left and right receivers.
Module: ARCS. RELTIM. FORGND. REDMGT. RECFIG. SERVO: Servo Monitoring
and Control
a) Description: A functional breakdown of the servo management task is shown
in Figure B13. The status assessment function comprises determining
the failure and engagement status of the servos. The hardware servo monitor
function takes care of the simplex and duplex cases. In triplex there is a requirement
for software servo monitoring of the servo sum current (Δp) to assess
servo operation. The following paragraphs discuss the servo functions shown
in Figure B13.
[Figure B13 not reproduced: the rotated figure is illegible in the source scan.]
Failure Status Assessment — This function is a test of the current failure
status of the servo system. If any failures have been registered the system is
no longer in a triplex configuration and hence no attempt will be made to software
monitor servo operation. Under these conditions control will be transferred
directly to output control, otherwise control will proceed through sum current
monitoring.
Engagement Status Assessment — If no failures have been registered, yet
one or more of the foreign channels have been disengaged, a vote of the three
sum currents will not be made and the affected servo fault condition will be registered
by incrementing its associated fault counter.
Foreign Sum Current Fault Detector — The three actuator sum currents (proportional
to Δp) will be brought together and voted for the purpose of localizing
a faulted actuator. If a foreign channel actuator is detected as having faulted,
control will be transferred to the Foreign Failure Monitor where a fault record
will be made in the form of a fault counter.
Local Sum Current Fault Detector — Local servo faults detected by the sum
current comparison will be accumulated by a local fault counter which is a part
of the Engagement Control. If no fault is detected control will transfer to a
check of the local hardware failure monitors.
Foreign Failure Monitor — Fault status information from the foreign channels
will be gathered by the foreign failure monitor from the engagement checks and
the sum current monitor. Fault status will be accumulated in the form of a fault
counter. When the fault count exceeds a pre-determined maximum limit a foreign
failure will be declared, and the servo management function will then, on the
next pass, revert to duplex mode of operation which means no software servo
monitoring.
Local Hardware Monitor Checks — If the sum current comparison indicates a
"no fault" condition, control will transition to a check of the local channel's
hardware monitors. The engagement status in the local channel is checked.
If it is engaged the local fault counter will be decremented. (The fault count
will be limited to a minimum value of zero.) If the actuator is disengaged with
a good sum current comparison, further checks will be made of the watchdog
monitor, power supply monitor, and software permanent fault flags status.
If the disengagement was caused by either of the hardware monitors, there is
nothing that the servo management can do to affect servo engagement. However,
if both of these monitors are indicating a good state an attempt will be made to
re-engage the servo actuator if the software flag is reset by commanding a shut-
down followed by re-engagement. The result of this action is that a reset pulse
will be sent to the servo, which, if the condition which caused the disengagement
has been cleared, will cause the servo actuator to be re-engaged.
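The local hardware monitor checks can be sketched as a decision function for the case where the sum current comparison found no fault. All names and the returned action labels are hypothetical; the logic follows only what the paragraph above states.

```python
def local_hardware_check(engaged: bool,
                         fault_count: int,
                         watchdog_good: bool,
                         power_supply_good: bool,
                         software_flag_reset: bool):
    """Return (new local fault count, action) after a good sum current check.

    action is 'none' or 'reengage' (shutdown command followed by re-engagement).
    """
    if engaged:
        # Healthy and engaged: count down, never below zero.
        return max(0, fault_count - 1), "none"
    if not (watchdog_good and power_supply_good):
        # A hardware monitor caused the disengagement; servo management
        # cannot affect servo engagement.
        return fault_count, "none"
    if software_flag_reset:
        return fault_count, "reengage"
    return fault_count, "none"
```

Whether the re-engagement succeeds depends on the servo itself, since the reset pulse only takes effect once the original cause has cleared.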
Engagement Control — This portion of the servo management function has
control over the shutdown and/or engagement of the servo actuator. A permanent
shutdown will result either when the fault counter exceeds its maximum allowable
level or a computer failure has been declared by the output monitor thereby
dictating that the associated servo be disengaged. Servo re-engagement will be
allowed only following a transient fault of the independent hardware monitor or
a temporary loss of hydraulic pressure.
Output Control — This process simply consists of a test of the computer fault
status from which the decision is made whether or not to update the servo command
output. For example, in duplex operation, when an output fault is detected
by the output monitor, both systems' outputs can be held until a fault location
decision can be made.
b) None
d) Data-Element Definitions:
FED - Foreign Engagement Discretes
FSC - Foreign Sum Currents
LSC - Local Sum Currents
FST - Failure Status Table
FFC - Foreign Fault Counter
LFC - Local Fault Counter
WD - Watchdog Monitor Flag
PSM - Power Supply Monitor Status Flag
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible nodes: Foreign Failure Monitor; Foreign Sum Current Fault Detector; Failure Status Assess; Engagement Status Assess; HDW Monitor Checks.]
f) Conditions:
SF - Servo Failure
FD - Foreign Channel Disengaged
LFD - Local Fault Detected
FFD - Foreign Fault Detected
CFD - Computer Fault Detected
B. 1.3 Application Software Modules
The ARCS application software modules perform the processes of control
law computation and mode control logic for those control laws.
Module: ARCS. RELTIM. FORGND. MODCON; Mode Control
a) Description: The mode control tasks are mode select, mode logic and mode
annunciation. Mode control inputs are discrete inputs representing pilot selec-
tions, airplane discretes and control law discretes. Outputs are flags that
determine the set of control law algorithms to be processed as a result of the
pilot's selections, the airplane configuration, flight condition, and progress
along the flight path.
b) Sub-processes:
MODSEL — Mode selection
MODLOG — Mode logic
MODANN — Mode annunciation
c) Inputs and Outputs of the Sub-processes:
MODSEL
Inputs
MODIN
Outputs
SELOUT
MODLOG
Inputs
SELOUT
LAWDIS
Outputs
MODE
ANNIN
MODANN
Inputs
ANNIN
Outputs
ANLITE
d) Data-Element Definitions:
(MODIN — FORGND)
(MODE - FORGND)
(LAWDIS - FORGND)
SELOUT — Mode selection outputs
ANNIN — Mode annunciation inputs
ANLITE — Mode annunciation outputs
e) Transition Diagram: [Diagram not recoverable from the source scan. Legible node: MODLOG. Legible edge labels: /MODIN/SELOUT; /SELOUT, LAWDIS/MODE, ANNIN; /ANNIN/ANLITE.]
Module: ARCS. RELTIM. FORGND. CONLAW; Control Law
a) Description: The near term application model control laws are those selected
for the baseline software. The control laws are functionally broken out by
the surfaces they control. The control law block diagrams are discussed in Appendix I.
Figure B14 shows the functional tree for the control laws. Figure B15 shows the inputs
and outputs of the control laws.
The scheduling of the control is a function of the flight mode requested by the pilot
and the current minor frame.
b) Sub-processes:
ELVATR — Elevator Function
AILRON — Aileron Function
INBSPL — Inboard Spoilers
RUDDER — Rudder Function
c) Inputs and Outputs of the Sub-processes:
ELVATR
Inputs
col c
Fcol F
"baro
VGS
0
VPC
FLAP
a
GSE
RADalt
Outputs
ELEVC
AILRON
Inputs I Outputs
Q-pot
0
P.
Awheel c
Fwheel F
TKA
HFC
LOC
y
• •
y
LFTAIL
RGTAIL
[Figure B14 not reproduced: the rotated figure is illegible in the source scan.]
[Figure B15 diagram not reproduced. Column headings: Sensors, Control Law, Control Surface. Control laws: Pitch CAS, Pitch Go-Around, Pitch Autoland, Roll CAS, Yaw CAS, Roll Autoland, Yaw Autoland. Control surfaces: Elevator, Left Aileron, Right Aileron, Left Inboard Spoiler, Right Inboard Spoiler, Upper Rudder, Lower Rudder.]
FIGURE B15 INPUTS AND OUTPUTS OF CONTROL LAWS
c) (cont'd)
INBSPO
Inputs: Q-pot, φ, P, Fwheel c, Fwheel F, TKA, HPC, RADalt, LOC, ẏ, ÿ, [one symbol illegible]
Outputs: LINSPO, RINSPO
RUDDER
Inputs: Fpedal c, Fpedal F, runwy
Outputs: UPRUD, LORUD
d) The Data-Elements:
LFTAIL - Left aileron command
RGTAIL - Right aileron command
ELEVC - Elevator command
LINSPO - Left inboard spoiler
RINSPO - Right inboard spoiler
UPRUD - Upper rudder
LORUD - Lower rudder
q - pitch rate
az - normal acceleration (aircraft)
Q-pot - dynamic reference pressure computed from air data
Fcol c - Captain's column force
Fcol F - F/O column force
ḣbaro - barometric altitude rate
[symbol illegible] - vertical acceleration (aircraft)
VGS - ground speed
φ - bank angle
VPC - vertical path command (from nav computer)
FLAP - flap position information
α - angle-of-attack
GSE - glide slope deviation
RADalt - radar altitude
P - roll rate
d) (cont'd)
Fwheel c - Captain's wheel force
Fwheel F - F/O wheel force
TKA - track angle
HPC - horizontal path command (from nav computer)
Fpedal c - Captain's pedal force
Fpedal F - F/O pedal force
r - yaw rate
LOC - localizer deviation
ẏ - cross track velocity
ÿ - cross track acceleration
runwy - runway heading
[symbol illegible] - heading
e) Transition Diagram:
B.1.4 System Test
The purpose of system test is to provide preflight tests and in-flight
tests. The preflight (or ground) tests are discussed in section B.1.4.1.
The in-flight tests are discussed in section B.1.4.2.
B.1.4.1 Ground Test
The ground tests provide preflight monitoring to assess flight worthiness.
Module: ARCS. GNDTST: Ground Test
a) Description: The ground test operation is composed of interrupt processing and
preflight/maintenance testing. Entry into this module will only occur when such
a request has been generated by the operator and the aircraft is stationary on the
ground. The transition out of real time operation occurs from the background
(asynchronous) tasks where both of these criteria are assessed. Ground testing
will test and hence ascertain the operational integrity of the entire ARCS system
including sensors, computers and servos.
b) Sub-Processes:
PREMAN — Preflight/maintenance testing
INTPRO — Interrupt processing
Module: ARCS. GNDTST. INTPRO: Interrupt Processing
a) Description: Though ground test is primarily an asynchronous operation the
timer interrupt normally serving as the synchronization time base is still allowed
to occur to serve as a time base for those tests which are time critical. Also
upon the occurrence of a timer interrupt a test is made of the current "on-ground"
status and the STP request buffer. The test program is exited if the on-ground
test is false or if the test operator has requested that the test be ended. Follow-
ing the processing of the timer interrupt control will return either to a continu-
ation of ground test or to real time control as shown in Figure B16.
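The decision taken on each timer interrupt can be sketched as follows. This is a minimal illustration in Python, not the ARCS code: the names `GroundTestState`, `on_timer_interrupt`, `on_ground` and `end_test_requested` are hypothetical, and setting the recovery flag on every exit to real time is an assumption drawn from the PREMAN module description.

```python
from dataclasses import dataclass


@dataclass
class GroundTestState:
    # Hypothetical stand-in for the "Recovery Flag" named in the report.
    recovery_flag: bool = False


def on_timer_interrupt(state, on_ground, end_test_requested):
    """Decide where control goes after a timer interrupt during ground test.

    on_ground          -- result of the current "on-ground" status test
    end_test_requested -- True if the STP request buffer holds an end-test request
    """
    if not on_ground or end_test_requested:
        # Assumption: the flag is set before every exit so that real time
        # operation re-initializes RAM and resynchronizes.
        state.recovery_flag = True
        return "REAL_TIME"
    return "GROUND_TEST"
```

Keeping the exit test inside the interrupt handler means a long-running asynchronous test cannot delay the return to real time by more than one timer period.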
FIGURE B16 GROUND TEST INTERRUPT PROCESSING
[Flowchart, only partly legible in the source scan. Legible blocks: TIMER INTERRUPT; an on-ground decision; a wheel spin decision; END TEST REQUESTED?; SET RECOVERY FLAG; RETURN TO GROUND TEST; GO TO REAL TIME.]
Module: ARCS. GNDTST. PREMAN: Preflight/Maintenance Testing
a) Description: Ground tests are comprised of both the preflight test and post
flight or maintenance test functions. The functional tree for this module is
shown in Figure B17.
The functions which will be included in the preflight test can be categorized
into two groups: those that are dispatch required as defined by the airframe
manufacturer and Federal Aviation Administration due to flight safety consider-
ations; and those mandated by the individual airline as needed for their specific
operations. Maintenance level testing on the other hand must include all aspects
of the AFCS including non-critical sensor and servo systems.
A request for ground test is generated by the operator via the System Test Panel
(STP). This information is acknowledged by the Background Test via the STP
input data table. On-ground status is determined by examination of two discrete
status signals, the main landing gear squat switches and the wheel spin discrete.
Activation of the ground test function is inhibited unless all discretes show the
aircraft to be stationary. Following completion of ground test, control will be
returned to Real Time Operations. Ground testing will be terminated whenever
either the test procedure is completed in a normal manner or the ground test
function is deselected by the test operator or the on-ground status ceases, as
would be the case if the aircraft begins to roll. This is effected by setting
the "Recovery Flag" prior to exiting from the ground test program which will
force the Real Time Operation to perform a RAM initialization followed by
recovery synchronization.
b) Sub-Processes:
INITIAL — Initialize
TEST — Test
DISPL — Display
[Figure B17: functional tree for preflight/maintenance testing; not legible in the source scan.]
d) Data Elements:
SIT — STP Input Table
DOT — Discrete Output Table
PST — Precondition Status Table
FST — Failure Status Table
DIT — Discrete Input Table
e) Transition Diagram:
PS/DIT
/DOT/PST
f) Conditions:
PS - Preconditions Satisfied
TCR - Test Continue Request
Module: ARCS. GNDTST. PREMAN. INITAL: Initialize
a) Description: Initialization consists of all those tasks necessary to put the system
into a configuration from which testing can be safely and accurately conducted.
This involves both the conditioning of RAM data and the assessment of required
preconditions. Conditioning of RAM data includes forcing a bypass of all servo
actuators since during the conduct of the test all aspects of the computer subsystem
will be exercised in some fashion and to allow servo actuators to be engaged during
this time presents a potentially hazardous condition. For ARCS System Test the
assessment of preconditions reduces to two tasks: to check the interface to and
from the STP, and to check for the existence of sensor valid discretes.
If the interface link to or from the STP is inoperative or the STP itself has failed
no further testing will be attempted. Following confirmation of the operational
integrity of the STP and its associated data links, each of the sensors is then
checked for the existence of a proper validity indication. It is assumed for this
check that the absence of a valid signal at this point in the test is most likely
due to lack of power to the associated sensor. This situation will then be brought
to the attention of the test operator via the STP who can then take corrective action.
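The initialization sequence above can be sketched in a few lines. This is an illustrative Python sketch only: the function name, the dictionary keys and the cue text are hypothetical, and treating "preconditions satisfied" as "STP link good and every sensor valid" is an assumption, since the report does not state whether testing continues while a sensor is unpowered.

```python
def initialize(stp_link_ok, sensor_valid):
    """Sketch of ground test initialization.

    stp_link_ok  -- True if the STP and its data links check good
    sensor_valid -- dict mapping sensor name to its valid discrete (bool)
    """
    status = {
        # Servo actuators are forced into bypass before any testing starts.
        "servos_bypassed": True,
        "preconditions_satisfied": False,
        "operator_cues": [],
    }
    if not stp_link_ok:
        # No further testing is attempted without a working STP interface.
        return status
    missing = sorted(name for name, valid in sensor_valid.items() if not valid)
    # A missing valid discrete is assumed to mean the sensor is unpowered.
    status["operator_cues"] = [f"CHECK POWER: {name}" for name in missing]
    status["preconditions_satisfied"] = not missing
    return status
```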
Module: ARCS. GNDTST. PREMAN. TEST: Test
a) Description: The test function (see Figure B18) consists of computer tests,
input and output electronics tests, sensor tests and servo tests. Entry into
this module comes from initialization and exit can occur either from a "de-
selection" of the ground test function or upon its completion. At the completion
of the test function sufficient information is available to ascertain the operational
status of the total system.
The procedure of ground testing makes use of a so-called "center out" test
philosophy wherein the most basic elements are tested first and then these basic
elements are used to test other functions. At the highest level, this test structuring
results in an organization which first tests the computer LRU followed by
sensors and concluding with servo systems. The computer testing segment can
be further broken down into its constituent parts, i. e., central processor element,
RAM and ROM memories, and input/output elements.
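The "center out" ordering can be illustrated with a short Python sketch. The function name and the stage callables are hypothetical, and stopping at the first failed stage is an assumption that stands in for the Critical Failure Detected (CFD) condition; the report does not define which failures are critical.

```python
def run_center_out(stages):
    """Run test stages in order, most basic element first.

    stages -- list of (name, test_fn) pairs; test_fn() returns True on pass.
    Returns the list of (name, passed) results actually obtained.
    """
    progress = []
    for name, test_fn in stages:
        passed = test_fn()
        progress.append((name, passed))
        if not passed:
            # Assumption: a failed inner element cannot be trusted to test
            # the elements that depend on it, so the sequence stops here.
            break
    return progress
```

A typical call would list the stages as COMTST, IOTST, SENSOR and SERVO, in that order.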
b) Sub-Processes:
COMTST - Computer Test
IOTST - Input/Output Test
SENSOR - Sensor Test
SERVO - Servo Test
c)
d) Data-Elements:
FST - Failure Status Table
TPT - Test Progress Table
SIT - STP Input Table
e) Transition Diagram:
[Transition diagram, only partly legible in the source scan. Legible labels: CTC, CFD, ITC, SRTC, FST, TPT, SENSOR, SERVO.]
f) Conditions:
CFD - Critical Failure Detected
CTC - Computer Tests Complete
ITC - I/O Tests Complete
SNTC - Sensor Tests Complete
SRTC - Servo Tests Complete
TCR - Test Continue Request
[Figure B18: test function; not legible in the source scan.]
Module: ARCS. GNDTST. PREMAN. TEST. COMTST: Computer Test Sequence
a) Description: Computer testing includes tests of the central processor, hardware
monitors, ROM memory, and RAM memory. The computer unit, in addition to
being the single most critical element in the ARCS control system, serves as the
test administrator for the rest of the System Test. Therefore its own integrity
must be verified before it can be called on to test peripheral systems. By similar
argument, the testing of the computer unit begins with a diagnostic of its central
processor unit (CPU).
Processor self-test performs an instruction test sequence which tests all instruc-
tions (or at least all micro-instructions), all registers, and all addressing modes.
The ability of each of the hardware monitors to detect and enunciate the existence
of a fault condition is next verified to assure that the system is free of latent fail-
ure conditions.
The ROM or constant memory will be tested using sum checking techniques wherein
all of the words in the program memory are summed together in one 16-bit word
ignoring any overflows that occur. The resultant sum is then compared to the
known constant 16-bit sum. RAM memory is checked by writing into and reading
out of all accessible RAM memory a predetermined data pattern and monitoring
the pattern read out as compared to the pattern which was written in.
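The two memory checks can be sketched as below. This is an illustrative Python model, not the MCP-701 routine: `rom_checksum`, `ram_pattern_test` and the `StuckBitRam` fault model are hypothetical names, and memory is modeled as a sequence of 16-bit words.

```python
def rom_checksum(words):
    """Sum all program-memory words into one 16-bit word, ignoring overflow."""
    total = 0
    for word in words:
        total = (total + word) & 0xFFFF  # discard any carry out of bit 15
    return total


def ram_pattern_test(ram, pattern):
    """Write a pattern to every word, read it back, return failing addresses.

    The test is destructive: previous RAM contents are overwritten.
    """
    failures = []
    for addr in range(len(ram)):
        ram[addr] = pattern
        if ram[addr] != pattern:
            failures.append(addr)
    return failures


class StuckBitRam:
    """Simulated RAM with some bits stuck at 0 at one address (demo only)."""

    def __init__(self, size, bad_addr, stuck_mask):
        self.cells = [0] * size
        self.bad_addr = bad_addr
        self.stuck_mask = stuck_mask

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, addr):
        return self.cells[addr]

    def __setitem__(self, addr, value):
        if addr == self.bad_addr:
            value &= ~self.stuck_mask & 0xFFFF  # stuck bits never store a 1
        self.cells[addr] = value
```

The ROM check passes when `rom_checksum(program_words)` equals the known constant stored with the program. A single pattern cannot expose every RAM fault (a bit stuck at 0 is invisible to an all-zero pattern), so a practical test would repeat the pass with complementary patterns.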
c)
d) Data-Elements:
FST — Failure Status Table
TPT — Test Progress Table
e) Transition Diagram: 
f) Conditions: 
CFD - Critical Failure Detected 
Module: ARCS. GNDTST. PREMAN. TEST. IOTST: Input/Output Test
a) Description: Analog, discrete and digital inputs and outputs are tested utilizing
what is commonly referred to as wraparound testing whereby a specific output
is commanded by the processor and the resultant output looped back into the
corresponding input processor. The received data can then be compared to
what was transmitted to check both input and output functions.
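A wraparound check reduces to a write, a read and a comparison. The Python sketch below is illustrative only: `LoopbackChannel` is a hypothetical software stand-in for the hardware loop-back path, and the `tolerance` parameter is an assumption added for analog channels, where an exact match cannot be expected.

```python
class LoopbackChannel:
    """Hypothetical stand-in for an output wired back to its input."""

    def __init__(self):
        self.latched = 0

    def write(self, value):
        self.latched = value

    def read(self):
        return self.latched


def wraparound_test(write_output, read_input, test_values, tolerance=0):
    """Command each test value, read it back, and collect mismatches.

    Returns a list of (commanded, received) pairs that differ by more
    than the tolerance; an empty list means the channel passed.
    """
    failures = []
    for value in test_values:
        write_output(value)
        received = read_input()
        if abs(received - value) > tolerance:
            failures.append((value, received))
    return failures
```

A mismatch shows that the output side, the input side, or the loop itself has failed; the wraparound test alone does not say which.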
Module: ARCS. GNDTST. PREMAN. TEST. SENSOR: Sensor Testing
a) Description: To verify the operational integrity of the ARCS sensor system
the sensor testing process involves stimulating the sensor outputs to produce
some expected output value. Wherever possible the sensor's own internal
selftest is used to produce this output by setting the appropriate computer output discrete
to cause automatic sensor selftest stimulation. When automatic stimulation is
not possible, the needed sensor output is generated by the test operator. This is
accomplished by formatting and displaying a cue to the test operator indicating
the required action.
For sensor systems, particularly analog equipment, the expected response time
to selftest stimulation may require wait times on the order of several seconds.
For this reason the sensor testing portion of the ARCS ground test operation is
a parallel operation where testing of several systems may be in process simul-
taneously. The test program for each sensor must therefore be broken down
into a series of subtasks each of which can be completed within its allocated
portion of the total frame time.
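One way to picture the subtask structure is with generators, where each `next()` call is one subtask that fits inside a frame. This Python sketch is an illustration under stated assumptions: the function names, the status strings and the fixed `wait_frames` count are hypothetical, and real stimulation and sensing are replaced by the `read_response` callable.

```python
def sensor_test(name, wait_frames, read_response, expected):
    """Generator for one sensor test; each step is one frame-sized subtask."""
    # Subtask 1: command the self-test stimulation (hardware access omitted).
    yield (name, "STIMULATED")
    # Subtasks 2..n: let the sensor settle, one frame at a time.
    for _ in range(wait_frames):
        yield (name, "WAITING")
    # Final subtask: compare the response with the expected value.
    yield (name, "PASS" if read_response() == expected else "FAIL")


def run_frames(tests):
    """Advance every active test by one subtask per frame until all finish."""
    results = {}
    active = list(tests)
    while active:  # each pass through this loop models one frame
        still_active = []
        for test in active:
            try:
                name, status = next(test)
            except StopIteration:
                continue
            if status in ("PASS", "FAIL"):
                results[name] = status
            else:
                still_active.append(test)
        active = still_active
    return results
```

Because a slow sensor only occupies one short subtask per frame, several sensors can be in their settling period at once.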
Module: ARCS. GNDTST. PREMAN. TEST. SERVO: Servo Testing
a) Description: Servo testing (see Figure B19) includes synchronization, engagement
control testing, dynamic response tests and force override tests. Upon comple-
tion of sensor testing the local test completion status will be communicated cross
channel and the test program will move on into the display portion. Servo testing
is one test area which necessarily requires a coordinated approach between all
operable computers. Testing will not proceed until it is ascertained that all
operable systems have completed sensor testing and are now ready to proceed.
At this point the display function shall cue the operator that the system is ready
to commence with servo testing and request further direction. The System Test
function will enter a wait loop while monitoring the STP input table for operator
inputs.
The first task to be accomplished in servo testing is to establish a regular time
reference to satisfy watchdog monitor requirements. This is effected by each
processor initiating a new iteration timer count and establishing a reference time
for the watchdog monitor, upon receipt of a test continue request from the STP.
Thereafter, whenever an iteration timer interrupt occurs, a synchronization process
similar to the normal real time synchronization will occur.
b)
c)
d) Data-Elements:
DIT - Discrete Input Table
DOT - Discrete Output Table
FST - Failure Status Table
TPT - Test Progress Table
[Figure B19: servo testing; not legible in the source scan.]
e) Transition Diagram:
f) Conditions:
SAE - Servo Actuator Engaged
TEI - Triplex Configuration
ECC - Engagement Control Complete
DRC - Dynamic Response Complete
Module: ARCS. GNDTST. PREMAN. DISPLY: Display
a) Description: The display function (see Figures B20 and B21) is comprised
of three processes: failure data processing, cross-channel communication,
and STP processing. Its purposes are to compile failure data generated by the
Initialization and Testing functions, present them to the operator in a meaningful,
useful manner, accept request inputs received from the operator via the STP,
and direct the testing sequence accordingly.
Failure processing consists of the accumulation and formulation of test progress
and failure data from the various tables used by the ground test function, and
deducing the information to be communicated outside. Upon entry into the fail-
ure processing state an assessment must be made as to what was the condition
detected that initiated the control transfer to determine the appropriate response.
The response includes formatting the information for cross-channel transmission
and/or display on the STP.
The state identified as Cross Channel Results is the vehicle by which the test
results, as derived by the local computer, are communicated to all other
operable computers. The state identified as STP processor performs the two
tasks required to interface the System Test function with the test operator -
format and transmit display data and process incoming operator commands.
[Figures B20 and B21: display function diagrams; not legible in the source scan. Recoverable block labels: STP OUTPUT BUFFER, DISPLAY PROCESSOR, FAILURE DATA ASSESSMENT, GROUND TEST PROGRAM.]
B.1.4.2 Background Tests
The background tests provide in-flight testing which is used to help localize
faults and to provide maintenance information.
Module: ARCS. RELTIM. BAKGND: Background Tasks
a) Description: The background tasks consist of background entry control, on-line
self-test, maintenance data update, and system test panel (STP) processing.
During any particular frame, the first operations to be executed will be the fore-
ground tasks required for that frame. When the foreground tasks are completed
the remaining time in the frame will be used to execute background tasks.
In-flight (or on-line) testing of the computer subsystem is the primary function
performed by the Background Task. Its purpose is to detect and/or localize
those failure conditions which might not be detected by first level system monitors.
The in-flight test is a continuous, repetitive check of the computer subsystem which
keeps a running record of anomalies within the local computer.
Background entry control serves as the task scheduler and directs control to the
particular function to be executed next. Maintenance data update gathers all
system failure data for maintenance purposes. STP processing serves as the
interfacing link outputting display data and accepting operator requests.
b) Sub-Processes:
ENTCON - Background entry control
ONLINE - On-line self-test
MANDAT - Maintenance data update
STPRO - STP processing
c)
d) Data-Elements:
SI - STP Input Table
MT — Mode Table
TI — Timer Interrupt
SST — System Status Table
MFT — Maintenance Failure Table
e) Transition Diagram:
f) Conditions:
GTR - Ground Test Requested
AOG - Aircraft on Ground
RTR - Real Time Required
STC - Self-Test Complete
DUC - Data Update Complete
SPC - STP Processing Complete
Module: ARCS. RELTIM. BAKGND. ENTCON: Background Entry Control
a) Description: Background entry control acts as the task scheduler and directs
control to the particular function to be executed next, based on the last instruction
executed in the previous frame. Depending on the nature of the particular
test being executed, it may be possible to simply start up on the next instruction
in sequence or it may be necessary to backtrack to the start of the test and begin
again on the next frame. The entry state in addition to scheduling the background
testing sequence must also include a deadline timer to provide assurance that the
background tasks are being executed on a regular basis. The deadline timer con-
sists of a frame counter which gets incremented every time the background tasks
are entered. The counter must be periodically reset by the background test
program or else a counter overflow will occur and a fault condition will be indicated.
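The deadline timer described above amounts to a counter with an overflow check. The following Python sketch is illustrative: the class name, the `limit` parameter and the latching of the fault are assumptions, since the report does not give the counter width or say how the fault is cleared.

```python
class DeadlineTimer:
    """Frame counter that flags a fault if it is not reset often enough."""

    def __init__(self, limit):
        self.limit = limit   # assumed overflow threshold, in frames
        self.count = 0
        self.fault = False   # assumption: the fault latches once raised

    def on_background_entry(self):
        """Called every time the background tasks are entered."""
        self.count += 1
        if self.count > self.limit:
            self.fault = True

    def reset(self):
        """Called by the background test program when a test loop completes."""
        self.count = 0
```

The timer catches a background test sequence that is stuck or starved of time, because only a completed pass through the sequence reaches the reset.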
Module: ARCS. RELTIM. BAKGND. ONLINE: On Line Test
a) Description: The three processes involved in on-line self-testing are: (1)
processor self-test/diagnostics, (2) memory testing, and (3) input and output testing.
Processor self-test is the primary software self-test for the computer unit.
Although it runs in the background mode, it must be completed repetitively as
determined by the time deadline, i. e., a maximum time is specified between
sequential completions of the test. The computer self-test begins with an instruc-
tion test sequence within which all instructions are exercised, all registers involved,
and all addressing modes used. If the processor performs an erroneous computation
and enters an endless loop, the time deadline for the computer
self-test will not be met.
The second test to be performed in background mode is a program memory sum
check, wherein all of the words in the program memory are summed and com-
pared to the known correct sum.
The third test in the computer self-test sequence is a scratchpad read-write test.
A number of locations in the scratchpad are dedicated to self-testing. On suc-
cessive iterations of the test, random patterns are written into these dedicated
locations and then checked.
Proper operation of the computer input and output sections is tested using wrap-
around loop checks of both analog and discrete data.
Module: ARCS. RELTIM. BAKGND. MANDAT: Maintenance Data Update
a) Description: Maintenance data update functions as a central collection point
of all failure data for maintenance purposes. Here all of the failure information
accumulated during the previous in-flight test loop is assessed along with the
failure status of the system monitors, i. e., SSFD, computer output monitor, and
servo monitor. From all of the available information, the maintenance update
state will resolve the LRU location in which a failure has occurred and, if the
failure has not already been registered, record it for future use by maintenance
personnel in a non-volatile section of memory. The maintenance update state
will then clear the in-flight test fault record and the deadline timer and transfer
control to STP processing for another iteration around the test loop.
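The record-once behavior of the maintenance update can be sketched as follows. This Python illustration makes simplifying assumptions: fault reports are taken to be already resolved to LRU names (the resolution logic itself is not described in the report), the non-volatile memory is modeled as a plain list, and all names are hypothetical.

```python
def update_maintenance(nonvolatile_log, inflight_faults, monitor_faults):
    """Record each newly failed LRU once, then clear the in-flight record.

    nonvolatile_log -- list standing in for the non-volatile failure record
    inflight_faults -- LRU names from the last in-flight test loop (cleared)
    monitor_faults  -- LRU names reported by the system monitors
    Returns the list of LRUs newly added to the log.
    """
    newly_recorded = []
    for lru in list(inflight_faults) + list(monitor_faults):
        if lru not in nonvolatile_log:  # skip failures already registered
            nonvolatile_log.append(lru)
            newly_recorded.append(lru)
    inflight_faults.clear()
    return newly_recorded
```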
Module: ARCS. RELTIM. BAKGND. STPRO: System Test Panel Processing
a) Description: STP input processing consists of checking for a device controller
interrupt flag and, when it is set, reading the corresponding request. Only two
STP requests will be acknowledged during background tasks. One is a read
request which will cause the current system status to be displayed, and the other
is a ground test request which will be acknowledged only when aircraft status
information indicates an "on ground" condition. If ground test is called for,
program control will transfer out of background, out of real-time and into the
ground test program. If the STP device controller interrupt flag is not set,
control will immediately transfer to on-line testing.
STP output processing only gets executed following a read request received by
the STP input processor. Under this circumstance the appropriate failure
status information will be formatted and sent to the STP device controller for
transmittal to the panel where it is displayed to the operator.
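The STP request handling described above can be summarized in a small decision function. This is a Python sketch under assumptions: the request codes and return labels are hypothetical, and treating an unacknowledged request as "continue with on-line testing" is an assumption not stated in the report.

```python
def stp_process(interrupt_flag, request, on_ground):
    """Decide what the background task does with a pending STP request.

    interrupt_flag -- state of the STP device controller interrupt flag
    request        -- "READ", "GROUND_TEST", or any other value
    on_ground      -- True if aircraft status indicates "on ground"
    """
    if not interrupt_flag:
        return "ONLINE_TEST"        # nothing pending, go straight to testing
    if request == "READ":
        return "DISPLAY_STATUS"     # format and send current system status
    if request == "GROUND_TEST" and on_ground:
        return "GROUND_TEST"        # leave background and real time
    # Assumption: any other request is ignored during background tasks.
    return "ONLINE_TEST"
```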
APPENDIX C
FUNCTIONAL LISTING OF MCP-701A INSTRUCTIONS
LOAD/STORE INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
LDU       Load UR from Program Memory                 40XX    1       2.0
LDUS      Load UR from Scratchpad Memory              7CXX    2       1.5
LULB      Load UR (Left Byte) Immediate               00XX    4       1.25
LURB      Load UR (Right Byte) Immediate              05XX    4       1.25
LDL       Load LR from Program Memory                 42XX    1       2.0
LDLS      Load LR from Scratchpad Memory              80XX    2       1.5
LLLB      Load LR (Left Byte) Immediate               01XX    4       1.25
LLRB      Load LR (Right Byte) Immediate              06XX    4       1.25
LDA       Load XA from Program Memory                 46XX    1       2.0
LDAS      Load XA from Scratchpad Memory              88XX    2       1.5
LALB      Load XA (Left Byte) Immediate               02XX    4       1.25
LARB      Load XA (Right Byte) Immediate              07XX    4       1.25
LDB       Load XB from Program Memory                 48XX    1       2.0
LDBS      Load XB from Scratchpad Memory              8CXX    2       1.5
LBLB      Load XB (Left Byte) Immediate               03XX    4       1.25
LBRB      Load XB (Right Byte) Immediate              08XX    4       1.25
LDC       Load XC from Program Memory                 4AXX    1       2.0
LDCS      Load XC from Scratchpad Memory              90XX    2       1.5
LCLB      Load XC (Left Byte) Immediate               04XX    4       1.25
LCRB      Load XC (Right Byte) Immediate              09XX    4       1.25
STU       Store UR into Program Memory                4CXX    1       2.0
STUS      Store UR into Scratchpad Memory             94XX    2       1.75
STLS      Store LR into Scratchpad Memory             98XX    2       1.75
STAS      Store XA into Scratchpad Memory             A0XX    2       1.75
STBS      Store XB into Scratchpad Memory             A4XX    2       1.75
STCS      Store XC into Scratchpad Memory             A8XX    2       1.75
ARITHMETIC INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
ADU       Add to UR from Program Memory               4EXX    1       2.0
ADUS      Add to UR from Scratchpad Memory            ACXX    2       1.5
ADBU      Add to UR (Right Byte) Immediate            0AXX    4       1.25
ADBL      Add to LR (Right Byte) Immediate            0BXX    4       1.25
ADBA      Add to XA (Right Byte) Immediate            0CXX    4       1.25
ADBB      Add to XB (Right Byte) Immediate            0DXX    4       1.25
ADBC      Add to XC (Right Byte) Immediate            0EXX    4       1.25
AMS       Add to Scratchpad Memory from UR            B4XX    2       2.5
DIV       Divide UR & LR by Program Memory            58XX    1       10.75
DIVS      Divide UR & LR by Scratchpad Memory         C8XX    2       10.5
MPY       Multiply UR by Program Memory               56XX    1       6.0
MPYS      Multiply UR by Scratchpad Memory            C4XX    2       5.75
SBU       Subtract Program Memory from UR             52XX    1       2.0
SBUS      Subtract Scratchpad Memory from UR          BCXX    2       1.5
REGISTER INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
ABSU      Absolute Value of UR                        19XX    4       1.25 - 1.75
CILB      Clear Indicator (Left Byte) - Immediate     1FXX    4       1.25
CIRB      Clear Indicator (Right Byte) - Immediate    20XX    4       1.25
CPLU      Complement UR                               17XX    4       1.5
INV       Invert UR                                   16XX    4       1.25
SILB      Set Indicator (Left Byte) - Immediate       1DXX    4       1.25
SIRB      Set Indicator (Right Byte) - Immediate      1EXX    4       1.25
TSU       Transfer SR to UR                           13XX    4       1.25
TUS       Transfer UR to SR                           14XX    4       1.25
XUA       Exchange UR and XA                          10XX    4       1.75
XUB       Exchange UR and XB                          11XX    4       1.75
XUC       Exchange UR and XC                          12XX    4       1.75
XUL       Exchange UR and LR                          0FXX    4       1.75
INPUT/OUTPUT INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
CLR       Clear Device Controller                     E8XX    2       1.25
CLRI      Clear Interrupt Specified                   23XX    4       1.25
ENBL      Enable Interrupts from Device               ECXX    2       1.25
INHB      Inhibit Interrupts from Device              F0XX    2       1.25
SHIFT INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
SLZ       Shift UR Left - Enter Zeros                 28XX    4       1.25 + .25(n)
SLZD      Shift Double Left - Enter Zeros             29XX    4       1.25 + .25(n)
SLZX      Shift Double Left by XC - Enter Zeros       2DXX    4       1.5 + .25(n)
SRC       Shift UR Right - Circulate Bits             2AXX    4       1.25 + .25(n)
SRCD      Shift Double Right - Circulate Bits         2BXX    4       1.25 + .25(n)
SRS       Shift UR Right - Repeat Sign                24XX    4       1.25 + .25(n)
SRSD      Shift Double Right - Repeat Sign            25XX    4       1.25 + .25(n)
SRSX      Shift Double Right by XC - Repeat Sign      2CXX    4       1.5 + .25(n)
SRZ       Shift UR Right - Enter Zeros                26XX    4       1.25 + .25(n)
SRZD      Shift Double Right - Enter Zeros            27XX    4       1.25 + .25(n)
DOUBLE PRECISION INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
ADD       Add Double from Program Memory              50XX    1       3.0
ADDS      Add Double from Scratchpad Memory           B0XX    2       2.5
ADMS      Add Double to Scratchpad Memory             B8XX    2       4.25
LDD       Load Double from Program Memory             44XX    1       3.0
LDDS      Load Double from Scratchpad Memory          84XX    2       2.5
STDS      Store Double into Scratchpad Memory         9CXX    2       3.0
SBD       Subtract Double from Program Memory         54XX    1       3.0
SBDS      Subtract Double from Scratchpad Memory      C0XX    2       2.5
ABSD      Absolute Value of Double Register           1AXX    4       1.25 - 2.25
CPLD      Complement Double Register                  18XX    4       1.75 - 2.0
ZRD       Zero Double Register                        15XX    4       1.5
NRM       Normalize Double Register                   2EXX    4       2.0 + .25(n)
LOGICAL INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
NDU       And to UR from Program Memory               5AXX    1       2.0
NDUS      And to UR from Scratchpad Memory            CCXX    2       1.5
ORU       OR to UR from Program Memory                5CXX    1       2.0
ORUS      OR to UR from Scratchpad Memory             D0XX    2       1.5
CBSP      Clear Bits Specified by Bit Mask            78XX    3       2.75
SBSP      Set Bits Specified by Bit Mask              74XX    3       2.75
SKSP      Skip on Bits Specified by Bit Mask          70XX    3       2.0 - 2.25
BRANCHING INSTRUCTIONS
Mnemonic  Instruction Description                     Opcode  Format  Execution Time
DSSZ      Decrement and Skip if Scratchpad is Zero    ECXX    2       2.5 - 2.75
JINT      Jump to Service Interrupt                   21XX    4       8.0
JMP       Jump Unconditional                          64XX    1       1.5
JMPI      Jump Unconditional, Indirect                66XX    1       2.0
JMS       Jump to Subroutine                          68XX    1       1.5
JMSI      Jump to Subroutine, Indirect                6AXX    1       2.0
JSNS      Jump After Device Sense                     F4XX    2       2.75
RTN       Return from Subroutine                      2FXX    4       1.0
RINT      Return from Interrupt Routine               22XX    4       5.0
SIE       Skip If Program Memory Equal to UR          60XX    1       2.0 - 2.2
SISE      Skip If Scratchpad Memory Equal to UR       D8XX    2       1.75 - 2.0
SIG       Skip If Program Memory Greater than UR      5EXX    1       2.0 - 2.2
SISG      Skip If Scratchpad Memory Greater than UR   D4XX    2       1.75 - 2.0
SIL       Skip If Program Memory Less than UR         62XX    1       2.0 - 2.2
SISL      Skip If Scratchpad Memory Less than UR      DCXX    2       1.75 - 2.0
SKLB      Skip on Indicator (Left Byte) - Immediate   1BXX    4       1.25 - 1.5
SKRB      Skip on Indicator (Right Byte) - Immediate  1CXX    4       1.25 - 1.5
SKR       Skip if Device is Ready                     E4XX    2       1.75 - 2.0
APPENDIX D
ARCS HARDWARE CONFIGURATION RATIONALES
This appendix contains the design rationale for the major configuration
decisions involved with the candidate ARCS hardware architecture. In most
cases, supporting rationale has been developed as a result of trade-off
evaluations. In some cases, the rationale follows essentially from previous
experience with fault tolerant digital systems. The following is an itemized
listing of the key tradeoff areas considered, with a brief summary of imple-
mentation options:
• Processor Functional Characteristics
a) Advanced processor which is improved by DOT experience
and other application evaluations.
• Electronics Packaging
a) Single LRU per system
b) Single LRU per channel
c) Separate computer and interface LRU's per channel.
• Cross-Channel Data Link
a) Autonomous operation in DMA mode at transmitter and receiver.
b) Processor controlled transmitter with DMA receiver.
c) Processor controlled transmitter with dual mode receiver - DMA
for normal mode and interrupt driven for recovery mode.
d) Dedicated, one-way independent serial or parallel busses.
e) Non-dedicated, two-way serial or parallel busses.
• Level and Method of Synchronization (all options assume software
sensor selection is required and is implemented)
a) Hardware methods resulting in bit identical, bit or frame synchro-
nous processing.
b) Software methods resulting in bit identical, frame synchronous
processing.
c) Software methods resulting in bit similar, frame synchronous processing -
low gain equalization around path integrators required
(following sensor selection).
d) Asynchronous processing - with equalization as in (c)
• Watchdog Monitor
a) Simple analog pulse-width monitor
b) Precise digital timer
• Servo Monitoring
(all options assume the GE candidate servo actuator)
a) Independent, dedicated hardware monitors, with dual servo loop
electronics per channel, which use cross-channel comparison
and failure logic.
b) Same as (a) but no cross-channel comparisons in hardware form;
all cross-channel servo monitoring in software.
• Non-Volatile Memory
a) Electromagnetic core
b) CMOS RAM with battery
c) MNOS memory
The discussion in the following subsections covers the major ARCS
hardware architecture decisions in each of the listed trade-off areas.
1.0 PROCESSOR FUNCTIONAL CHARACTERISTICS
In evaluating the desirable and undesirable features of the MCP-701
central processing unit, the DOT experience and usage provided the judge-
ment basis. The major undesirable features of the MCP-701 equipment
were:
• Large overhead burden associated with interrupt initiated input output
processing.
• Significant time required for DMA input output processing.
• Inability to use read only memories in place of core random access
memories.
The desirable features of the MCP-701 related equipment were:
• A varied and powerful instruction repertoire, such as the one used by
the MCP-701, has proved to be well suited to the computational needs
of the real-time control.
• A 16-bit standard and 32-bit double precision word size in a fixed-point
machine has been found adequate for control computation.
• Inclusion of dedicated hardware fault detection in critical functional
areas (memory, arithmetic section, etc.) proved to be vital in demon-
strating and maintaining correct system performance.
• Modularized, autonomous operation of the CPU and memory, CPU
and I/O, and the I/O and memory, has been found to be both useful
and efficient, especially for the purposes of automatic testing and
fault isolation.
• Flexible input output system capable of dealing with both computer
controlled and interrupt driven devices was a necessity.
A detailed analysis of instruction usage was conducted. The DOT applications
indicated a typical control processor instruction mix as: Load/Store -
40%, Multiply - 3%, Divide - 0.5%, (other) Arithmetic - 5.5%, Logical - 5%,
Branch - 13%, Non Memory Reference - 32%.
Using this mix to evaluate computer throughput shows a faster machine than
is predicted by other often used mix equations. Figure D-1 shows the throughput
of the MCP-701 as predicted by three methods.
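A mix equation of this kind weights each instruction class by how often it occurs. The Python sketch below shows the arithmetic only. The function and dictionary names are hypothetical; the per-class times in `ASSUMED_TIME_US` are placeholder values loosely chosen from the Appendix C listing, and reading those execution times as microseconds is an assumption, since no unit is given here. The result is therefore illustrative and is not expected to reproduce the values in Figure D-1.

```python
def throughput_kops(mix, time_us):
    """Estimate throughput in thousands of operations per second (KOPS).

    mix     -- dict of instruction class -> relative frequency (any scale)
    time_us -- dict of instruction class -> assumed execution time in
               microseconds
    """
    total_weight = sum(mix.values())
    weighted_time = sum(weight * time_us[name] for name, weight in mix.items())
    mean_time_us = weighted_time / total_weight  # average time per instruction
    return 1000.0 / mean_time_us                 # 1 op per microsecond = 1000 KOPS


# Instruction mix reported from the DOT applications (percent; sums to 99).
DOT_MIX = {
    "load_store": 40.0, "multiply": 3.0, "divide": 0.5, "arithmetic": 5.5,
    "logical": 5.0, "branch": 13.0, "non_memory": 32.0,
}

# Placeholder per-class times: assumptions for illustration only.
ASSUMED_TIME_US = {
    "load_store": 1.75, "multiply": 6.0, "divide": 10.75, "arithmetic": 2.0,
    "logical": 2.0, "branch": 2.0, "non_memory": 1.25,
}

estimate = throughput_kops(DOT_MIX, ASSUMED_TIME_US)
```

Normalizing by the sum of the weights keeps the estimate meaningful even when the quoted percentages do not add up to exactly 100.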
The detailed analysis work performed was of great benefit in suggesting
changes to the architecture and organization of the MCP-701. The specific
changes involved:
• Architecture/Opcode Modifications
• Memory Organization/Interface
• Input Output Organization/Interface
The opcode and architecture modifications were a direct result of the opcode
usage survey. It was determined that:
• Use of multiple index registers would be very beneficial.
• Given the presence of multiple index registers, the usefulness of
indirect addressing was limited.
• Displacement addressing (plus or minus) would be a more efficient
instruction addressing mode. It would tend to make the page boundaries
become transparent.
FIGURE D-1 MCP-701 THROUGHPUT PREDICTION
[Bar chart; horizontal axis marked at 300, 400 and 500 KOPS.]
85% ADD, 10% MPY, 5% DIV mix: 370 KOPS
[percentage illegible] ADD, 18% BRANCH, 5% MPY, 1% DIV, 8% MISC mix: 450 KOPS
DOT experience mix: 430 KOPS
• Little used or unused opcodes were eliminated from the instruction
repertoire.
• New instructions were suggested for the instruction set (such as
immediate instructions).
• "Macro" instructions were suggested to minimize the interrupt overhead
burden.
• A more useful set of bit manipulation instructions was suggested.
• A masking capability over all priority interrupts was suggested.
• The status register was expanded from 8 to 16 bits.
A new memory organization/interface was defined. It maintains independence
between program and variable memory:
• A separate, independent addressing and control structure is provided
for program and variable memories.
• Program memory may physically be in the form of semiconductor read
only (ROM), programmable read only (PROM), or Core.
• Complete timing and addressing compatibility is provided between the
various program memory forms.
• A mix of the program memory types is allowable.
• A semiconductor random access memory (RAM) is to be provided for
variable storage.
• The semiconductor RAM is to be present regardless of the form of the
program memory.
The above guidelines allow great flexibility in expanding the memory size.
They allow a "zero software change" if PROMs or ROMs replace all or
part of a Core memory. The separate program and variable memories
eliminate all time burden associated with traditional direct memory access
operations.
The input output organizational changes are consistent with the memory and
CPU architecture modifications. It was determined that the input output
structure should be embedded in the variable memory structure. From this,
the concept of the directly operable input output (DOIO) evolved. The DOIO
has the following features:
• Input output devices are addressed as if they were scratchpad memory
locations.
• The CPU may perform arithmetic, logical, or branching operations
directly on the I/O data.
• A significant increase in computational throughput results since data
does not have to be software transferred to main line memory before
manipulation.
• Input output devices may communicate with the CPU under processor
control (PC), interrupt initiation (DVC), or direct memory access (DMA)
modes of operation.
The overall suggested changes defined an organization which was modular
and expandable. They defined an architecture which is consistent with
various redundancy levels, is flexible enough to permit degraded redundant
performance, and is suitable for auto restart or recovery techniques.
The baseline ARCS processor was evaluated against other aerospace pro-
cessors using a benchmarking technique. Improvements to the MCP-701
instruction set, addressing structure, and I/O system were systematically
evaluated. The effectiveness of the ARCS processor is illustrated by the
performance comparisons shown in Figures D-2 and D-3. Figure D-2 is a
throughput comparison (using a conservative throughput mix equation),
and Figure D-3 shows the input output processing time overhead burden.
The combination of the directly operable input output (DOIO) with the high
throughput of the processor makes this processor ideally suited to the needs
of real time, redundant, recovery oriented, control applications.
The current implementation of the ARCS processor utilizes state of the art
medium scale integration (MSI) and large scale integration (LSI) devices.
They are low in power, currently available, and operate over the temperature
and environmental range required for airborne applications. This
implementation is expected to be used in near term and intermediate term
production aircraft.
[Figure D-2: Aerospace processor throughput comparison, by processor
(vertical scale 100 to 500). Throughput calculated from the 85% add,
10% multiply, and 5% divide mix equation using memory reference
instructions. Bar values shown: 425, 370, 370, 310.]
[Figure D-3: Aerospace processor input/output overhead burden, by processor.
Overhead burden = Tp - Ts, where Tp = total I/O interrupt processing time
and Ts = actual time spent servicing the I/O device.]
The current trends toward microprocessors and n-bit slice micro-controller
technology are deemed useful for the far term production aircraft. In their
current form such designs have the following limitations:
• Environmental temperature range
• Limited throughput
• Limited or non existent second source availability
It is anticipated that the far term ARCS implementation will use such devices.
The functional definition and system architecture may remain substantially
intact. Little or no change in redundancy configuration or reconfiguration
strategy can at this time be predicted as a result of this new technology.
The approaches developed in using the near term ARCS implementation will
also be the basis for the far term implementation.
2.0 ELECTRONICS PACKAGING
The possibility of packaging all ARCS electronics in a single LRU per system
is rejected because:
• The single LRU would (at least in the near term) exceed standard ATR
package size requirements for commercial transports.
• In flight-critical and flight-crucial applications the vulnerability of a
single LRU to physical and electrical damage is inconsistent with flight
safety requirements.
The major trade-off in the packaging area concerns a single LRU per channel
versus separate computer and interface units per channel. The primary
factors which influence the trade are:
• Maintenance and logistics (cost-of-ownership)
• Functional reliability
• Hardware design
Assuming that the single LRU achieves an overall complexity factor that
is less than twice the complexity of either of the separate computer or inter-
face units - at less than the sum of their costs - then the cost-of-ownership
trade should clearly favor the single LRU approach. Since this is a reason-
able assumption, the motivation for considering separate computer and
interface units must be based upon functional reliability or hardware design
benefits, not cost-of-ownership benefits.
From a reliability standpoint, there is one potential advantage for the
separate unit approach. If it is possible to achieve autonomous operation
of the interface unit, and sensor selection is employed in the computer units,
then there is a true voting node between the units (relative to system inputs).
In order to achieve autonomous operation of the interface units - in a system
where bit identical, frame synchronous processing is required - it is neces-
sary to supply the interface units with frame sync information. Unless
hardware or software voters are used in the interface units, it is impossible
to transfer frame sync information from the computer units without creating
functional dependence. The addition of hardware voters fixes the redundant
structure, and consequently eliminates application flexibility. The addition
of software voters requires the addition of a processor to the interface units.
Without developing quantitative reliability data, it appears that there is no
meaningful reliability benefit which results from the separate unit approach,
unless hardware voters are added. Such hardware is considered unacceptable
under the ARCS development ground rules. Further, even if voters could be
added for frame sync information, it is doubtful whether the net reliability
gain, after adding a new power supply and timing structure for the separate
interface unit, would be significant.
From the point-of-view of hardware design, the single LRU approach is by
far the simplest approach functionally, and in terms of the amount of I/O
hardware required. Duplicate power supplies and local timing structures
are eliminated. The possibility of needing duplicate buffer memories is
completely eliminated. Further, with the DOIO concept the total interface
multiplexing function is effectively eliminated in the single LRU approach.
Having all I/O data sources available on the computer bus structure, as
RAM memory locations, is a major advantage of DOIO which greatly sim-
plifies software.
For all of the reasons just discussed, the single LRU per channel approach
was selected for the candidate ARCS architecture.
3.0 CROSS-CHANNEL DATA LINK TRADE-OFFS
One of the most important trade study areas for ARCS is the definition of
the interchannel data link structure. This interchannel data link will be
used under normal operation to exchange sensor data for selection, com-
puted outputs and servo position data for monitoring and check words for
data link integrity checking. It will also be used to exchange computer state
vectors for recovery. It is important that the software involved in
communicating through the link be minimal under normal and recovery conditions,
and that the link provide a high enough data rate so that flexibility is allowed
for developing executive strategy. At the same time, a very substantial
portion of the computer electronics may be involved in the interchannel
data link so careful justification of all requirements is mandatory.
3.1 DATA TRANSFER REQUIREMENTS
The cross-channel data transfer requirements involve two factors: the
total number of distinct data words to be exchanged (without regard for
timing), and the maximum number of data words which must be exchanged
for any one minor cycle of computation. Based upon Boeing's application
models and the currently available software estimates, the following list
indicates the maximum number of distinct data words to be exchanged.
    Normal mode SSFD                         33 words
    Recovery mode SSFD                      429 words
    Normal and recovery mode control
      law state vector                      120 words
    Test and miscellaneous                    5 words
    Total                                   587 words
These requirements reflect the use of dedicated sensor interfaces and a
particular Boeing-derived sensor selection and failure detection (SSFD)
algorithm. During each minor cycle of computation, only those words needed
for that cycle must be exchanged cross-channel. Different combinations of
data words may be exchanged during each minor cycle according to the solu-
tion rate requirements for the various program segments. The minor cycle
time is currently planned to be 10 msec (corresponding to 100 solutions
per second). While the exact executive time scheduling has not been deter-
mined, it is not expected that the rate of cross-channel data exchange will
exceed 64 words in one minor cycle.
At the currently established data rate of 19.5 µsec/word, the transmission
of 64 words would require approximately 1.25 msec. Even with the software
overhead associated with loading the transmitter LIFO, the entire 64 word
transmission would be completed in approximately 1.5 msec. With appropriate
executive scheduling within a minor cycle, it is possible to overlay other
functions during the actual transmission of data. Since data is received
and stored in a DMA mode at each receiver, the relative time burden in a
10 msec minor cycle may be made minimal (as little as 0.25 msec).
The provision of a 10-bit address label on each data word, and a correspond-
ing 1024 word buffer memory at each receiver, is more than adequate to
handle the 587 word requirement.
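The sizing arithmetic in this subsection can be checked with a short sketch. The constants are the figures quoted in the text; the dictionary keys and function names are illustrative assumptions, not names used in the report.

```python
WORD_PERIOD_USEC = 19.5    # cross-channel rate: one word every 19.5 microseconds
MINOR_CYCLE_MSEC = 10.0    # planned minor cycle (100 solutions per second)
LABEL_BITS = 10            # address label carried with each data word

# Maximum number of distinct data words to be exchanged, from the list in 3.1.
DISTINCT_WORDS = {
    "normal_mode_ssfd": 33,
    "recovery_mode_ssfd": 429,
    "control_law_state_vector": 120,
    "test_and_miscellaneous": 5,
}


def transmission_time_msec(n_words):
    """Time on the link for n_words at the established word rate."""
    return n_words * WORD_PERIOD_USEC / 1000.0


def buffer_capacity_words(label_bits):
    """Number of uniquely addressable receiver buffer locations."""
    return 2 ** label_bits


def fraction_of_minor_cycle(n_words):
    """Share of one minor cycle occupied by the transmission itself."""
    return transmission_time_msec(n_words) / MINOR_CYCLE_MSEC


if __name__ == "__main__":
    total = sum(DISTINCT_WORDS.values())
    print("distinct words:", total)
    print("buffer capacity:", buffer_capacity_words(LABEL_BITS))
    print("64-word transmission (msec):", transmission_time_msec(64))
```

The link time for 64 words occupies roughly one eighth of a minor cycle, and the 10-bit label leaves several hundred spare buffer locations beyond the 587 word requirement.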
3.2 CROSS-CHANNEL I/O ARCHITECTURE CONSIDERATIONS
In formulating the structure for the cross-channel data transmitter and
receiver, the major constraints to be satisfied were:
• To provide a cross-channel communication capability which would
satisfy all of the normal mode exchange requirements yet minimize the
transmitter and receiver overhead burdens associated with the cross-
channel process. This capability must be such that bit identical sensor
selection and output selection can be accomplished.
• To provide a cross-channel communication capability which would satisfy
the large data requirements potentially required for a recovery process.
The CPU burden for the transmitter must again be minimized such that
it is not required to halt real time computation. It is assumed the
receiver CPU burden is insignificant since it is in a recovery process.
The data requirements for bit identical selection of sensor and output
signals were discussed in Section 3.1. The desire to
minimize the CPU burden for both the transmitter and receiver during normal
operation and for the transmitter during a recovery operation will be discussed
in this section. Three techniques were examined for minimizing CPU burden.
They were:
• Direct Memory Access (DMA) operations
• Processor Controlled (PC) operations
• Use of MCP-701A application dependent "macro" opcodes.
The results are presented below.
3.2.1 Elimination of CPU Burden During Transmission
A concept to provide great flexibility would be to have the cross-channel
transmitter as a processor controlled device. CPU instructions would then
be required to transfer the data from one I/O zone to the cross-channel
transmitter. If a looping subroutine were used for this purpose, and 64
variables were to be transmitted, the software routine would consume about
1.2 milliseconds. This is certainly not minimizing CPU burden.
The selected concept utilizes an MCP-701A application dependent opcode.
It is a solution which minimizes the CPU burden yet allows the CPU great
flexibility in selecting the source and quantity of data to be cross-channeled.
This "macro" opcode would be defined as:
    Mnemonic:     MOVE - Block Transfer Scratchpad Memory
    Description:  Data is transferred in a block of XC words from one
                  scratchpad memory zone to another. Index register
                  B (XB) specifies the source address for the first
                  word. Index register A (XA) specifies the destination
                  address for the first word. Index register C (XC)
                  specifies the number of words to be moved in a block.
                  Data is transferred in order of descending address.
                  Thus, the source zone corresponds to locations XB
                  through XB - XC, and the destination zone corresponds
                  to locations XA through XA - XC. There are no address
                  mode variations allowed with this instruction.
                  When used to load the LIFO in the cross-channel
                  transmitter, the MOVE instruction is used as described in
                  subsection 3.2.3.1.3. The execution time for the
                  instruction is determined as follows:
                      ET = 1.5 + 2.5 (XC) µsec
                  For a block transfer of 64 words into the LIFO
                  transmitter, the execution time is 161.5 µsec.
                  The MOVE opcode reduces the processor overhead
                  burden by more than 7:1 for cross-channel transmissions.
                  Further, it minimizes the hardware complexity of the
                  transmitter. Flexibility is inherent in the opcode since
                  the programmer may select the block size and starting
                  address. A series of MOVE instructions may be
                  programmed to select data from multiple source zones.
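The behavior of the MOVE opcode can be modeled with a short sketch. This is an illustration only, not the MCP-701A microcode: scratchpad memory is represented as a Python list, the function names are assumptions, and the sketch assumes that exactly XC words are copied (addresses XB down to XB - XC + 1), which is one reading of the zone description above.

```python
def move(memory, xa, xb, xc):
    """Copy xc words in descending address order.

    xb is the source address of the first word and xa is the destination
    address of the first word, mirroring index registers XB, XA, and XC.
    ASSUMPTION: exactly xc words are moved and the zones do not overlap.
    """
    for offset in range(xc):
        memory[xa - offset] = memory[xb - offset]


def move_execution_time_usec(xc):
    """Execution time from the stated formula ET = 1.5 + 2.5 (XC) microseconds."""
    return 1.5 + 2.5 * xc


if __name__ == "__main__":
    scratchpad = list(range(32))
    move(scratchpad, xa=31, xb=10, xc=4)
    print(scratchpad[28:32])
    print(move_execution_time_usec(64), "microseconds for 64 words")
    # Compare with the looping subroutine estimate of about 1.2 msec.
    print(round(1200.0 / move_execution_time_usec(64), 2), "to 1 improvement")
```

A series of calls to move with different xb values corresponds to the series of MOVE instructions that gathers data from multiple source zones.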
3.2.2 Elimination of CPU Burden While Receiving
The primary consideration here involves the receipt of recovery mode data
as opposed to normal mode data. Within the DOIO structure of the MCP-701A,
the DMA mode of operation for the receiver is the clear choice to minimize
CPU burden. However, the amount of data required in the recovery mode
may exceed the size of the local scratchpad memory dedicated to each receiver.
To handle this problem, one approach would be to allow the receiver to revert
to an interrupt initiated processor controlled mode.
This approach has two disadvantages, one of which is judged to be fatal.
First, the CPU burden is increased during recovery, but this is of minor
importance, because the CPU has nothing else more pressing to do than
recover. The more serious difficulty is that using an interrupt initiated
receiver mode requires that Boeing requirement (c) in subsection 2.3 be
violated. That is, one computer channel must have the capability to interrupt
another. Because this is not allowed on ARCS, the decision was made to
utilize a 1024 word scratchpad RAM at each receiver in a DMA mode. The
CPU receiver burden is therefore zero, and no cross-channel interrupts are
needed.
3.2.3 Data Rate and Format
As indicated earlier, the selected cross-channel transmission rate is one
word every 19.5 µseconds. This rate is based upon the bit serial
transmission of 29 bits (16 data, 10 label, 2 sync, 1 parity) at a 2 MHz rate
with 5.0 µseconds word separation.
The word format is illustrated in Figure D-4. The signal format is a self-
clocking format derived for MIL-STD-1553. This signal format requires only
two wires per transmission line, and is compatible with AC coupled media,
such as certain fibre optic transmission systems employ.
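The word period follows directly from the bit count, bit rate, and word separation given above, as the sketch below shows. The packing functions are illustrative assumptions: the field order (label, then data, then parity) follows the layout suggested by Figure D-4, the parity sense is ASSUMED to be odd (the text does not state it), and the sync field is left out of the packed integer because its waveform is not described here.

```python
SYNC_BITS = 2
LABEL_BITS = 10
DATA_BITS = 16
PARITY_BITS = 1
BIT_RATE_MHZ = 2.0     # bits per microsecond
WORD_GAP_USEC = 5.0    # separation between words


def word_period_usec():
    """Time from the start of one word to the start of the next."""
    total_bits = SYNC_BITS + LABEL_BITS + DATA_BITS + PARITY_BITS  # 29 bits
    return total_bits / BIT_RATE_MHZ + WORD_GAP_USEC


def pack_payload(label, data):
    """Pack label, data, and an ASSUMED odd parity bit into a 27-bit integer."""
    if not 0 <= label < (1 << LABEL_BITS):
        raise ValueError("label must fit in 10 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 16 bits")
    body = (label << DATA_BITS) | data
    ones = bin(body).count("1")
    parity = 0 if ones % 2 == 1 else 1  # make the total number of ones odd
    return (body << PARITY_BITS) | parity


def unpack_payload(word):
    """Return (label, data, parity_ok) for a word built by pack_payload."""
    parity_ok = bin(word).count("1") % 2 == 1
    body = word >> PARITY_BITS
    return body >> DATA_BITS, body & ((1 << DATA_BITS) - 1), parity_ok


if __name__ == "__main__":
    print(word_period_usec(), "microseconds per word")
    print(unpack_payload(pack_payload(587, 0xBEEF)))
```

The 10-bit label is what lets the receiver store each word at its own buffer address in a DMA mode without CPU involvement.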
3.3 SERIAL VERSUS PARALLEL TRANSMISSION
The considerations here involve:
• Hardware complexity and thus reliability.
• Data rate requirements such that a bit identical selection process
may occur.
• Recovery strategy
• Use of dedicated vs non-dedicated bus structures.
Section 3.1 discussed the data exchange requirements. A word serial, bit
serial transmission format is believed to be adequate for the indicated
requirements. The transmission of 64 variables would require only
1.25 milliseconds. As little as 0.162 milliseconds of CPU burden may
be associated with the transfer process as pointed out above. This allows the
CPU to pipeline or overlay other software executive functions.
[Figure D-4: Cross-channel word format. Fields: SYNC, LABEL (10 bits),
DATA (16 bits), PARITY.]
Reliability considerations are obviously in favor of the serial transmissions.
Fewer parts are involved. Cabling and connector cost, and aircraft weight
considerations all favor the serial transmission format.
The bit and word serial format is consistent with the recovery considera-
tions and strategies presented below.
The trade-off concerning the use of independent, dedicated one-way busses,
versus any non-dedicated bus structure may be resolved, in most practical
implementations, by simply considering cable failure effects. In particular,
the effects of cable failures on sensor selection are of critical interest.
Figure D-5 illustrates the common type of "open" failure for two cases.
In case 1, the cable failure shown is a "single-point" failure whenever bit
identical sensor selection is employed. The reason for this is that with the
two-way busses a single "open" failure causes each channel to sensor select
from a different set of three sensor values (as shown in Figure D-5). The
independent, dedicated one-way bus structure shown as case 2 does not have
this problem. For this reason, and other similar failure effect reasons,
the case 2 bus structure has been selected for ARCS.
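The difference between the two cases can be reproduced with a small model of which sensor values each channel receives. This is a sketch under stated assumptions: links are modeled as directed (source, destination) pairs, a two-way bus failure is modeled as the loss of both directions between two channels, "X" marks a value that did not arrive, and the helper names are invented for the example.

```python
from collections import Counter

CHANNELS = ("A", "B", "C")


def selection_sets(open_links):
    """Return the three values each channel selects from, given open links."""
    sets = {}
    for receiver in CHANNELS:
        sets[receiver] = tuple(
            source if source == receiver or (source, receiver) not in open_links else "X"
            for source in CHANNELS
        )
    return sets


def two_way_failure(first, second):
    """An open on a shared two-way bus removes traffic in both directions."""
    return {(first, second), (second, first)}


def one_way_failure(source, destination):
    """An open on a dedicated one-way bus removes a single direction only."""
    return {(source, destination)}


def channels_in_agreement(sets):
    """Largest number of channels selecting from an identical set of values."""
    return max(Counter(sets.values()).values())


if __name__ == "__main__":
    case_1 = selection_sets(two_way_failure("A", "C"))
    case_2 = selection_sets(one_way_failure("C", "A"))
    print(case_1, channels_in_agreement(case_1))
    print(case_2, channels_in_agreement(case_2))
```

With the two-way failure no two channels select from the same set, whereas with the one-way failure two channels still select from the complete set.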
[Figure D-5: Cable failure examples. Case 1, non-dedicated two-way busses:
with one open failure the channels select from (A/B/X), (A/B/C), and (X/B/C).
Case 2, independent, dedicated one-way busses: with one open failure the
channels select from (A/B/X), (A/B/C), and (A/B/C).]
4.0 SYNCHRONIZATION TRADE-OFFS
One of the most basic decisions to be made in the design of a redundant
digital computing system such as ARCS concerns the level and method of
synchronization to be employed. Traditionally, this decision has involved
one or more of the following three considerations, and has been intimately
related to decisions concerning sensor selection and computer output
monitoring:
a) A desire to hold the input sample times and output update times for
corresponding signals in each channel within a specified tolerance
of each other.
b) A desire to eliminate all drifting of path integrators implemented in
the control laws and thus obviate the need for any cross channel
equalization.
c) A desire to have bit identical computer outputs for corresponding signals
in each channel.
These considerations are closely interrelated and involve a whole spectrum
of tradeoffs in both hardware and software areas. For example, if bit
identical outputs are desired, frame synchronization with sensor selection
or cross-channel equalization is mandatory. Either hardware or software
or mixed implementations are possible for all of these functions. To compound
the tradeoff problem, the question of synchronization for the ARCS system
involves the further consideration of the requirements for recovery.
The following discussion considers various tradeoffs with the important
simplifying assumption that whatever level of synchronization is employed,
it is a software process. It is the recovery of a faulted computer to opera-
tional status which has the largest potential impact upon the level of syn-
chronization required.
Several different recovery procedures have been hypothesized for providing
transient fault tolerance within redundant computing systems. Each of
these recovery procedures addresses a certain class of transient faults and
carries implications for normal system operation in many areas. Certain
of these recovery procedures require substantial additional hardware beyond
that required for a nonrecoverable system, which leads to a lower basic
channel reliability, and thus, limits the practical advantages of incorporating
them. The following subsections detail the hardware and software considera-
tions involved in implementing the five recovery procedures entitled roll
ahead, roll back, memory copy, restart, and coasting.
4.1 SYNCHRONIZATION AND RECOVERY STRATEGY
Program roll ahead is defined by Ultrasystems to be capable of effecting
"instantaneous" transient fault recovery under certain conditions. Accord-
ing to the definition, roll ahead involves the exchange of state variables
between each pair of redundant computers following the execution of each
program segment. Each program segment operates on an input state
vector to generate an output state vector, which is then used as the input
state for the next program segment. If the program memory is intact in the
computer which suffered the transient fault then recovery is possible.
Recovery would simply involve having the faulted computer use the state
vector from an unfaulted computer for the next and possibly several ensuing
program segments. However, the failed computer must be notified by
another computer (Ultrasystems suggests an interchannel interrupt) to use
an unfailed computer's state variables for subsequent program segments, as
needed, until the faulted computer's state vector agrees with those of the
unfailed computers.
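The roll ahead idea can be illustrated with a toy simulation of two computers executing the same chain of program segments. Everything here is hypothetical: the segment functions, the state vector contents, and the way the transient fault is injected are invented for illustration, and the notification mechanism (interrupt or otherwise) is not modeled.

```python
def segment_a(state):
    """Hypothetical program segment: advance x."""
    return {"x": state["x"] + 1.0, "y": state["y"]}


def segment_b(state):
    """Hypothetical program segment: accumulate y from x."""
    return {"x": state["x"], "y": state["y"] + 2.0 * state["x"]}


def roll_ahead(segments, initial_state, fault_segment):
    """Run an unfaulted and a faulted computer through the same segments.

    A transient fault corrupts the faulted computer's output state vector at
    index fault_segment. On disagreement, the faulted computer adopts the
    unfaulted computer's vector as input for the next segment.
    Returns (agreement_history, unfaulted_state, faulted_state).
    """
    good = dict(initial_state)
    faulted = dict(initial_state)
    history = []
    for index, segment in enumerate(segments):
        good = segment(good)
        faulted = segment(faulted)  # program memory is assumed to be intact
        if index == fault_segment:
            faulted = {name: value + 99.0 for name, value in faulted.items()}
        agree = faulted == good
        history.append(agree)
        if not agree:
            faulted = dict(good)  # roll ahead using the unfaulted state vector
    return history, good, faulted


if __name__ == "__main__":
    print(roll_ahead([segment_a, segment_b, segment_a], {"x": 0.0, "y": 0.0}, 1))
```

The faulted computer disagrees only at the end of the corrupted segment and is back in agreement one segment later, which is the "instantaneous" recovery the definition describes.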
There are a substantial number of underlying assumptions behind the
Ultrasystems roll ahead procedure, a few of which are:
1) An interchannel data link structure exists between each pair of
computers which is capable of very high data exchange rates.
2) The application program is carefully segmented.
3) An unfailed computer channel can be trusted to interrupt the processes
of a faulted computer in a predictable manner, and spurious
interrupts generated by a faulted computer can be properly inhibited
from affecting the normal operation of an unfailed computer.
4) The variable portion of each computer memory is n (n-tuple redundancy)
times the size required for simplex operation, or, that the computers
are running in a frame synchronous manner where each frame defines
a program segment.
5) Each computer verifies its program segment state vector against all
other computers' vectors at the end of each program segment and before
proceeding to the next segment.
6) There are relatively few elements in the state vector for each program
segment, or, that all computers are operating on identical data in a
frame synchronous manner, and are expected to produce the same
state vector, so that the comparison execution time is minimized.
For the ARCS, assumption 1 above would dictate the inclusion of an inter-
computer parallel data bus structure. This has been judged to require too
much electronics for the ARCS. Assumption 3 violates a specific ARCS
reconfiguration principle ruling out such interchannel interrupts. Assump-
tion 4 significantly increases the size and power requirements for each
computer in such a system. Assumption 2 is typical of real time flight
control problems, and assumptions 5 and 6 fit easily within the anticipated
operational scenario of the ARCS.
The basic notion of roll ahead, state vector exchange between an unfailed
computer and one which has been judged to be transiently failed, is a
sound procedure, and the ARCS has been defined to use this procedure in a
modified version as one of the primary recovery mechanisms.
Roll back simply involves the recomputation of a specific program segment
if the interchannel state vector comparison operation at the end of that
segment indicated an error in that computer. Roll back as defined by Ultrasystems
involves all of the assumptions indicated under roll ahead. In addition, the
use of roll back as a recovery procedure dictates that all application program
segments be written so as to preserve all state vector values, both from
the previous iteration and from the current iteration. This causes an effective
doubling of the size of the variable memory. Roll back is primarily applicable
to a relatively restricted set of transient faults, i.e., those where only the
computed results are corrupted. Roll back is primarily useful as a transient
fault isolation/recovery procedure when only two computers have been operat-
ing satisfactorily and a difference is detected between their results, but no
fault isolation is available from other sources. The ARCS hardware has been
configured to permit a roll back recovery procedure to be used provided
that appropriate procedures for use of the relatively low rate interchannel
data transfer are adopted.
Memory copy involves the exchange of the program and variable memory
contents between an unfailed computer and a failed computer in an effort
to recover from transient faults in the memory electronics which have
permanently altered the contents of the program memory. The ARCS has
been defined to use ROM or PROM program memory so there is no require-
ment for such a recovery procedure.
Restart simply involves re-initializing the program variables and develop-
ing a new program solution upon detection of an out-of-tolerance condition.
The ARCS has been defined to make use of this recovery procedure. Flight
control problems are closed loop computations and as such contain informa-
tion from a relatively short period of history (typically less than 2 minutes).
That is of course an impossibly long time for recovery of a computer unless
other unfailed computers are satisfactorily controlling the aircraft. It does
have the advantage that the operation of the unfailed computers need not be
disrupted while the failed computer is recovering. It also dictates that
something less than bit-for-bit identical results be established as the
monitoring criteria at the computed outputs.
4.2 RECOVERY IMPACT ON NORMAL OPERATION
As indicated in the foregoing discussion of the various potential recovery
mechanisms, the inclusion of recovery as part of the system operational
scenario has a significant impact on many hardware and software system
characteristics. The ARCS recovery mechanism is a modified roll ahead
where the program segment output state vector is transferred across channel
throughout the computation of a program segment rather than waiting until the
end of the program segment computation. This approach is more compatible
with the LIFO driven cross-channel transmitter which is used with the ARCS
system than it would be with a conventional DMA mode cross-channel block
transfer. In addition, it permits the use of serial transmission with the
data rate of 19.5 µsec per word as defined for the ARCS. This recovery
procedure does not dictate any level of synchronization for the ARCS since
sufficient storage is allocated within the cross-channel receiver for unique
storage of all interchannel data including the recovery data.
The ARCS uses a frame synchronized computational scheme with all
computers performing the same computation during each frame. Since the
same sensor data is used by all computers under normal operation (an
executive strategy decision), this approach yields bit-for-bit identical
computer outputs.
4.3 SOFTWARE FRAME SYNCHRONIZATION
Two basic algorithms to perform the synchronization task were investigated.
The first algorithm waited for all computers to be ready; the second
algorithm selected one computer as the master and the others followed it.
To prevent any possibility of the system hanging up on first failure, all the
wait loops for both algorithms have a time out counter included in their
mechanization.
In all of the methods studied, defined by the table of Figure D-6, the three
sync discretes are assumed to be logic indicators L1, L2, and L3 in the status
register of the CPU. Each computer can set one of the indicators and test
all three to monitor the state of the computers. Two schemes for
interconnecting the indicators were investigated. In the first interconnection
scheme all computers set L1, but tested only L2 and L3. That meant that the
meaning of L2 and L3 was different in each channel. The interconnection of
the indicators was permuted in the wiring harness. This interconnection
results in the least software to set, clear and test the sync commands. The
second scheme did not permute the meaning of the indicators. Each computer
determined which slot it was plugged into, and set either L1, L2 or L3.
This resulted in the corresponding indicators in the other channels being set.
[Figure D-6: Table of the synchronization methods studied.]
The synchronization performance for a particular implementation of an
algorithm is measured by the SKEW between the three channels. SKEW is
defined to be the difference in time when any two channels reach the SYNC
point in a particular implementation. Worst case SKEW would be the largest
SKEW between any two channels.
Because of the wait loops in the mechanizations described, each computer
will detect that another is ready for synchronization some variable amount
of time after the LSC is set in the other computer. The time will depend
upon where the local computer was in the wait loop at the time the LSC
was set, but is bounded by a minimum and maximum value. Looking at
Figure D-7, let T be the time when the sync algorithm is satisfied. Some
time t1 later is the shortest time in which a computer can respond and get to
the sync point for a particular implementation. The time t2 would be the
maximum time for which it can be stated with absolute certainty that the
indicator will be detected. The maximum SKEW = t2 - t1 would be defined
by t2 of the latest of the three computers and t1 of the earliest to arrive
at the sync point.
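The SKEW definitions above reduce to simple arithmetic, sketched below. The function names and the sample times are illustrative assumptions; the times can be in any consistent unit.

```python
from itertools import combinations


def worst_case_skew(arrival_times):
    """Largest difference in sync point arrival time between any two channels."""
    return max(abs(first - second) for first, second in combinations(arrival_times, 2))


def maximum_skew(t1_earliest, t2_latest):
    """Bound on SKEW: t2 of the latest computer minus t1 of the earliest."""
    return t2_latest - t1_earliest


if __name__ == "__main__":
    # Hypothetical arrival times of three channels at the sync point.
    print(worst_case_skew([100.0, 103.5, 101.0]))
    # Hypothetical response bounds for one implementation.
    print(maximum_skew(t1_earliest=2.0, t2_latest=9.5))
```

For three channels the pairwise maximum equals the spread between the earliest and latest arrival, which is why the bound depends only on t1 of the earliest and t2 of the latest computer.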
The first sync algorithm, the "Wait" approach, requires that all computers
(for no detected failures) be ready before sync release is achieved. Figure
D-8 is a simplified flow chart for this algorithm, which has the following
characteristics:
• Set LSC
• Wait, testing the FSC, until one of the other computers is ready.
• If neither is ready before time limit is exceeded, interpret this as
loss of sync.
• After second channel is detected, wait for third channel.
• If time limit is exceeded for third channel, mark it failed and continue
to SYNC point.
• Clear LSC.
• Test sync indicators to be sure none failed set.
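The main wait loops of the steps listed above can be sketched as follows. This is a simplified model, not the flight software: the indicator access functions are passed in as callables, the time out is counted in loop passes rather than real time, the final test for indicators that failed set is omitted for brevity, and all names (wait_sync, scripted_indicators, the result strings) are assumptions made for the example.

```python
def scripted_indicators(readings):
    """Return a reader that replays scripted (other_1, other_2) ready states,
    repeating the last reading once the script is exhausted."""
    iterator = iter(readings)

    def read():
        return next(iterator, readings[-1])

    return read


def wait_sync(set_lsc, clear_lsc, read_fsc, time_limit):
    """Sketch of the "Wait" algorithm.

    Returns ("loss_of_sync", None) if neither other computer becomes ready,
    otherwise ("sync", failed) where failed is the index of a computer that
    never became ready, or None if both did.
    """
    set_lsc()
    ready = read_fsc()
    passes = 0
    while not any(ready):  # wait for the second computer
        passes += 1
        if passes > time_limit:
            clear_lsc()
            return ("loss_of_sync", None)
        ready = read_fsc()
    failed = None
    passes = 0
    while not all(ready):  # wait for the third computer
        passes += 1
        if passes > time_limit:
            failed = ready.index(False)  # mark it failed and continue
            break
        ready = read_fsc()
    clear_lsc()  # sync point reached
    return ("sync", failed)


if __name__ == "__main__":
    reader = scripted_indicators([(False, False), (True, False), (True, True)])
    print(wait_sync(lambda: None, lambda: None, reader, time_limit=5))
```

Because every wait loop is bounded by the time out counter, a single failed computer cannot hang the remaining two.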
[Figure D-7: SKEW definition, showing the period of time during which the
indicator can be detected.]
[Figure D-8: The "Wait" algorithm for frame synchronization (flow chart):
start; test sync indicators clear; wait for second computer ready, or loss
of sync if the time limit is exceeded; wait for third computer ready, or
indicate the appropriate computer failed if the time limit is exceeded;
sync point; test sync indicators clear; exit.]
The second sync algorithm, the "Master" approach, uses the concept of a
master computer, as shown in Figure D-9, and has the following characteristics:
• Test sync indicators for clear.
• Set LSC.
• Test sync indicators for set.
• Select master computer based on the failure information available.
• If local computer is master, wait for a time equal to maximum
allowable SKEW, then clear LSC.
• If local computer is not master, go into a wait loop testing master
computer's sync indicator until it is cleared.
• Test sync indicators to be sure none failed set.
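A comparable sketch of the "Master" approach is given below. It is a simplified model under stated assumptions: the master selection rule (lowest-numbered channel not marked failed) is ASSUMED, since the text says only that selection is based on the failure information available; the initial and final indicator tests are omitted; the follower's wait is bounded in loop passes; and all function names and result strings are invented for illustration.

```python
def scripted_indicators(readings):
    """Return a reader that replays scripted sync indicator states for all
    three channels, repeating the last reading once the script is exhausted."""
    iterator = iter(readings)

    def read():
        return next(iterator, readings[-1])

    return read


def select_master(failed):
    """ASSUMED rule: the lowest-numbered channel not marked failed."""
    for channel, is_failed in enumerate(failed):
        if not is_failed:
            return channel
    return None


def master_sync(local, failed, read_indicators, set_lsc, clear_lsc,
                delay, max_skew, time_limit):
    """Sketch of the "Master" algorithm for one channel."""
    set_lsc()
    master = select_master(failed)
    if master is None:
        clear_lsc()
        return "loss_of_sync"
    if master == local:
        delay(max_skew)  # give the other channels time to set their LSC
        clear_lsc()      # clearing releases the followers
        return "sync"
    for _ in range(time_limit):
        if not read_indicators()[master]:  # master has cleared its indicator
            clear_lsc()
            return "sync"
    clear_lsc()
    return "loss_of_sync"


if __name__ == "__main__":
    reader = scripted_indicators([(True, True, True), (False, True, True)])
    print(master_sync(2, [False, False, False], reader,
                      lambda: None, lambda: None, lambda t: None, 7.5, 10))
```

In this model the followers reach the sync point on the first loop pass after the master clears its indicator, so the achievable SKEW depends on the length of one pass of the follower wait loop.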
The relative merits of each approach become more apparent when the
practical consideration of failure detection is discussed.
The primary requirement for sync routine failure detection is that loss of
synchronization be detected. In addition, information can be collected
indicating that the local channel is in sync with only one other channel. This
information is then passed on to the executive. If a local failure was detected, the
executive may wait until the error has been detected in several consecutive
frames before disengaging the servo actuators.
Figure D-10 shows the real time "window" that defines the period of time
within which all channels should set their LSC. For the cases where one
or more computers are outside that "window", there is no information avail-
able in the local computer to allow it to detect the failure. The only real
time information available to the local computer is that it has had a timer
interrupt. Since the three clocks are not perfectly synchronized, this
occurs at slightly different times in each computer.
FIGURE D-9 THE "MASTER" ALGORITHM FOR FRAME SYNCHRONIZATION
FIGURE D-10 "TIME WINDOW" POINT OF VIEW FOR FAILURE DETECTION: (a) ALL COMPUTERS WITHIN TIME WINDOW, (b) ONE COMPUTER LATE, (c) ONE COMPUTER EARLY
The method of failure detection proposed for the "Wait" algorithm is to define
a maximum period of time during which each computer waits for the other
two. If they are not detected ready to sync, the local computer indicates
a failure. This time period must take into account tolerances on the three
clocks as well as variations in software execution time due to branching
decisions.
Figure D-11 shows a mechanization where all computers set their LSC,
then wait a period tw to see if the FSC's are set. A maximum difference
of 2tw can exist between the time the first computer sets its LSC and the
time the last computer sets its LSC, for the case of no detected faults. This
time is twice as long as the tolerance specified for a single computer
because the wait loop for both computers is re-initialized to tw after the
second computer is detected ready to sync. Both computers must wait
for the same period of time for the last one in order to minimize SKEW.
The time td is the time the LSC remains set during the sync routine after the
computers have synchronized. This is a fixed time in all computers. If the
third computer arrives after the maximum wait time of the other two, the
first two will call the third failed. However, if the LSC of the first two
computers is still set when the third checks, the third channel will detect
no errors. All channels would detect the error if the FSC of the first two
computers was clear when the third channel checks.
If the first channel is ahead of the others by more than tw, it will detect two
failures. In all of the mechanizations studied, two simultaneous failures
are not considered probable. A simultaneous failure is defined as two computers
failing within approximately one frame time. If the local computer
detects neither of the other channels ready for synchronization with no
previous failures, it assumes itself failed. The other two channels will
detect no errors if the LSC of the first channel is still set when they check.
FIGURE D-11 FAILURE DETECTION BASED EXCLUSIVELY ON WAIT TIMES: (a) ALL COMPUTERS JUST UNDER MAXIMUM TOLERANCE, (b) COMPUTER #3 LATE, (c) COMPUTER #1 EARLY
This shows why the time td should be as short as possible in this mechanization.
If it had been about half the time shown, all computers would have
detected the failures in both cases.
The failure detection capability can be improved so that the failures shown
in Figure D-11 are detected. This can be done by adding a test at the
beginning of the sync routine to check that the FSC's are clear before setting the
LSC. The local computer must wait 2tw after checking the FSC's and before
setting its LSC. By waiting for this time period, all computers within
tolerance will be guaranteed not to set their LSC before the latest one checks
it for clear.
From the point the indicator is set, the algorithm remains unchanged.
Figure D-12 shows the last two cases in Figure D-11 with the addition of this
test. In both cases, the test occurred when the FSC was high. The last
case in Figure D-12 shows that all failures are still not detected identically
in the three computers.
This illustrates that it is extremely difficult to guarantee unambiguous
failure detection in all computers. For this reason, the test of the local
computer's sync command as well as the foreign computers' indicators is
included in each computer. Also, the redundant test of the sync indicators
being cleared helps. The additional information tends to increase the
probability of detecting failures.
Accurate failure detection is a basic requirement of the sync routine using
the concept of a master computer. The master is selected based on whether
channel #1 or #2 has failed, so all computers must make the same decision.
The ideas of failure detection developed for the "Wait" algorithm can be
applied to the master computer approach also. A test of the sync indicators
clear at the beginning and end of the routine will be assumed.
FIGURE D-12 ADDITIONAL FAILURES DETECTED WITH TEST FOR SYNC INDICATOR CLEAR AT BEGINNING OF ROUTINE: (a) COMPUTER #3 LATE, (b) COMPUTER #1 EARLY, (c) NON-IDENTICAL FAILURE DETECTED
FIGURE D-13 IDENTICAL FAILURES DETECTED IN ALL COMPUTERS FOR MASTER APPROACH: (a) NO FAILURES, MASTER LAST, (b) COMPUTER #3 LATE, (c) MASTER COMPUTER EARLY
Figure D-13 shows a case where the master computer is the last to set its LSC. All
of the computers are within tolerance, but the maximum deviation between
computers is one-half of what it was for the other methods. This is because
all computers must arrive within the wait time tw defined by the master
computer. Each computer checks to see that the sync commands are clear,
waits tw, and sets its LSC. They then wait another period tw and check to
see that the sync indicators are set. A time period td expires, then the master
computer waits for a time tw to allow all other computers to get into a loop
testing the master's sync command. It then clears its sync command and
continues. As can be seen in Figure D-13, the computers which are not
master must wait for a time period of 2tw for the master, in order to allow
for the possibility of being early. Synchronization is achieved on the trailing
edge of the LSC, exactly as the iteration timing reference is being reset.
Figure D-13 also shows a case where computer #3 is slower than the master
by more than tw. Because of the tests of the sync indicator being clear
and set, all computers detect the failure. It also shows the master arriving
before the other two by more than tw. This failure was detected in all
computers, which required #2 and #3 to select #2 as master. The sequence
of events that results in detection of different failures in each computer is
possible with the "Master" computer approach also.
All of the frame synchronization methods described here perform the sync
task adequately, using the candidate hardware configuration, when all com-
puters are within tolerance. One of the primary tasks defined at the start
of the study was to determine the maximum SKEW for the various routines.
It was believed that this would be very instrumental in selecting the synchro-
nization routine. After doing a detailed analysis of the three methods
described, the SKEW problem was found to be virtually nonexistent. The
tolerance on the oscillators used for the iteration timing reference is 0.01%,
which is a deviation of only 1 µsec for a 10 millisecond frame time.
Therefore, synchronization is required to prevent only long term drift.
All of the sync routines described will hold the SKEW to less than 10 micro-
seconds for no failures.
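The arithmetic behind these figures can be checked with a short calculation. A deviation of 1 µsec over a 10 millisecond frame corresponds to a fractional oscillator tolerance of 1 part in 10,000; the function names below are hypothetical, and the accumulated value assumes a single oscillator running at its tolerance limit with no resynchronization.

```python
def drift_per_frame_us(tolerance, frame_ms):
    """Worst-case timing deviation over one frame, in microseconds."""
    return tolerance * frame_ms * 1000.0


def accumulated_drift_us(tolerance, frame_ms, frames):
    """Deviation after a number of frames if no resynchronization occurs."""
    return drift_per_frame_us(tolerance, frame_ms) * frames


# One frame of 10 msec at a tolerance of 1e-4 drifts about 1 microsecond;
# one second (100 frames) without synchronization drifts about 100.
print(drift_per_frame_us(1e-4, 10))
print(accumulated_drift_us(1e-4, 10, 100))
```

The per-frame deviation is small, but it grows linearly with the number of frames, which is why the sync routine is needed only to hold off long term drift.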
As the study progressed it became clear that detecting failures identically
in each computer was the most difficult problem. All of the methods
described allow some period of time where one computer may detect another
failed, but the third will not. The sync routine just tries to detect those
failures that prevent the local computers from synchronizing with one or
more other channels. The overall failure detection task is the responsibility
of the system executive software. If simultaneous failures are considered
possible, then the executive must be very careful not to disengage the third
channel when loss of sync is detected with no previous failures. This would
result in loss of the system with two failures.
After further study it may be discovered that additional information is
available to the executive which allows it to perform the failure detection
task acceptably. Additional failures might be detected if separate discretes
were used for the sync command and for resetting the iteration timing
reference. This would allow the local computer to leave its LSC set when it
detected itself failed. The other channels would then detect the failure. The
local channel's watchdog monitor would not trip out if its iteration timing
reference were reset. The failed channel could disengage its actuators and
try to resync with the other computers.
Reviewing the information presented in Figure D-6 shows that the selected sync
algorithm, Method IV, has all of the desired features. Its operation does
not depend on unambiguous failure detection information, which makes the
"Master" algorithm unacceptable. It re-evaluates failure status each time,
so that the recovery process is made easier. Finally, the memory size
and SKEW are the best, considering the functions it performs.
4.4 WATCHDOG MONITORS-TRADEOFFS
From the discussion of software frame sync routines in the previous section,
it is apparent that the LSC discrete will hold its period and pulse-width
within "tens" of microseconds. Considering this, it is reasonable to
conclude that the watchdog monitor should have a corresponding time
interval tolerance; say, 50 µsec over a 10 msec frame time. Such precision
suggests a digital counter design for the watchdog monitor.
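A digital counter watchdog of this kind can be modeled as follows. This is a minimal sketch under assumptions not stated in the report: the counter is clocked at 1 µsec per tick, the monitor trips both when the reset pulse arrives early and when it is overdue, and the class and method names are hypothetical.

```python
class CounterWatchdog:
    """Counter-based watchdog for a periodic reset pulse (illustrative)."""

    def __init__(self, nominal_us=10_000, tolerance_us=50):
        self.low = nominal_us - tolerance_us    # earliest acceptable reset
        self.high = nominal_us + tolerance_us   # latest acceptable reset
        self.count = 0
        self.tripped = False

    def tick(self):
        """Advance one assumed 1 microsecond clock; trip if overdue."""
        self.count += 1
        if self.count > self.high:
            self.tripped = True

    def reset_pulse(self):
        """Reset pulse received: trip if it came early, then restart."""
        if self.count < self.low:
            self.tripped = True
        self.count = 0


wd = CounterWatchdog()
for _ in range(10_000):   # a nominal 10 msec frame
    wd.tick()
wd.reset_pulse()
print(wd.tripped)         # stays False for an on-time reset
```

An analog monitor would replace the two count limits with a much wider timing window, which is the tradeoff discussed in the following paragraph.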
There is, however, a valid tradeoff to be made concerning the use of a less
precise, and simpler, analog watchdog monitor design. The tradeoff
hinges upon the questions of what constitutes "adequate" second failure
coverage, and how significant is the watchdog monitor to this coverage?
At this point in the ARCS study only a qualitative answer is possible. The
digital design is favored because it offers the "maximum" watchdog monitor
performance at a very slight increase in hardware complexity.
4.5 SERVO MONITORING TRADEOFFS
Two primary factors influence the servo monitoring tradeoff; they are:
• Servo-actuator and electronics design
• Reliability requirements.
The central question in the trade-off concerns the location of the boundary
between hardware and software servo monitoring. In order to address
this question, it is first necessary to establish the failure monitoring
requirements in terms of the above two factors.
Figure D-14 presents the baseline ARCS triplex force summed actuation
configuration which combines low pressure gain servo valve/actuator
modules and cross-channel monitoring with self-monitored, independent
channels.
The actuation scheme imposes a requirement that the multiple inputs and
feedbacks to the actuator channels track each other within some allowable
tolerance. For the actuator feedbacks, this means the LVDT gain tracking
as well as the excitation voltage must be controlled within an allowable
tolerance. Experience with the 680J actuator and the HLH-ATC actuator
has shown that feedback gain tracking can be closely controlled. Multiple
analog inputs from the digital/analog converters of synchronous digital
computers or from voters in an analog system have been shown to track
within tolerances of less than ±1%. Previous experience has shown that the
basic force summed actuator with single stage, low pressure gain valves,
will operate and meet specified performance without the need for differential
pressure feedback equalization if the required tracking tolerances are
satisfied. Triplex force summed actuators with two stage, high pressure
gain valves, such as the HLH-ATC actuators, require differential pressure
feedback in order to operate without being in a constant, saturated force
fight condition.
As shown in Figure D-14, the failure detection/isolation scheme consists
of comparison monitoring between dual input and feedback signal paths in
each actuator channel as well as cross-channel monitoring of servo valve
currents in the digital computer. For a single stage mechanization, the
current in the valve is proportional to the differential pressure across the
actuator piston.
The independent actuator monitor consists of a current comparator at the
output of the two servo amplifiers which can detect a hardover (active) or
open (passive) failure in one of the duplicated signal paths. The single
comparator will detect feedback failures such as a broken LVDT probe,
open LVDT winding and loss of LVDT output due to loss of excitation voltage
if each LVDT is excited through separate wires from the power supply in
the computer to the LVDT at the actuator. Passive failures in both paths
due to loss of electrical power to the servo amplifiers, demodulators, etc.
would have to be protected against by including a power monitor in the fail
detect electronics. The single current monitor would also detect an open
wire to the coil of the servo valve or an open in one of the coils itself.
If the current driver, short circuit proof type servo amplifier is used to
drive the coil of the servo valve and a coil short occurred, the current
monitor might not detect this failure. However, an additional comparison
of the voltage across the coils could be added if the probability of a short
is not considered remote. To protect against hydraulic system pressure
loss or a plugged filter or nozzle in the jet pipe servo valve, two pressure
switches connected to the extend/retract struts of the actuator are used.
Both switch contacts would be closed with loss of pressure and would be
used to annunciate the passive failure.
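The independent channel monitor described above reduces to a small amount of logic, sketched here. The function name, argument names, and the single threshold are hypothetical; the sketch only shows how the current comparison, the power monitor, and the pressure switches would combine into one channel failure indication.

```python
def channel_fail_detect(current_1, current_2, power_ok,
                        pressure_lost, threshold):
    """Return True if the actuator channel should be declared failed.

    current_1, current_2: outputs of the two duplicated servo amplifiers.
    power_ok: False on loss of electrical power (passive failure in
              both paths, which the comparator alone cannot see).
    pressure_lost: True when both pressure switch contacts are closed.
    threshold: allowable current mismatch (assumed value, same units).
    """
    miscompare = abs(current_1 - current_2) > threshold
    return miscompare or (not power_ok) or pressure_lost
```

The power and pressure terms are needed because a failure that affects both signal paths equally leaves the two currents in agreement.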
The cross-channel monitor shown as a part of the digital computer provides
additional coverage of failure modes not detected by the comparison
monitoring. This monitor would detect a failure in the control computation
of one computer which feeds both input paths of an actuation channel and
could cause a large difference between the currents (pressures) in the
three actuation channels. Now the cross-channel monitor could be imple-
mented such that upon second failure the remaining two actuation channels
would be shut down. However, with force limit valves contained in each
actuator which restrict the force difference between the two remaining
actuators to a level below the friction of all three channels, the actuator
output will not drift hardover upon second failure. Thus, the cross-channel
monitor can be inhibited from calling for shutdown on second failure.
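A software cross-channel monitor along these lines might look like the following sketch. Comparing each channel against the median of the three currents is an assumed technique, not one stated in the report, and the function name and threshold are hypothetical. With fewer than three active channels the monitor reports nothing, which models the inhibit on second failure.

```python
from statistics import median


def cross_channel_monitor(currents, threshold, already_failed=frozenset()):
    """Return the set of channels whose servo valve current disagrees.

    currents: mapping of channel name to servo valve current.
    threshold: allowable deviation from the median (assumed value).
    already_failed: channels isolated by an earlier failure.
    """
    active = {ch: i for ch, i in currents.items() if ch not in already_failed}
    if len(active) < 3:
        # Second-failure shutdown is inhibited: no new channel is called out.
        return set()
    mid = median(active.values())
    return {ch for ch, i in active.items() if abs(i - mid) > threshold}
```

For example, currents of 1.0, 1.1, and 4.0 with a threshold of 0.5 isolate only the third channel, since the median tracks the two channels that agree.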
The combination of monitoring schemes implemented in the candidate
actuation system for ARCS should provide the essentially unity coverage
required for first failure isolation in the actuation system. The duplicated
signal path comparison monitor provides the required independent, highly
reliable output servo monitor required by the ARCS ground rules. The
single stage servo valve, triplex force sum actuator and monitoring scheme
in Figure D-14 satisfies the ground rule that: "Each computer may control
the disconnection of the servo it feeds and shall not directly or in conjunc-
tion with other computers effect the operational status of the other servos."
Some other actuation schemes such as the triplex, active/on-line mechani-
zation used on the Boeing-Vertol HLH Direct Electrical Linkage System
would not appear to satisfy the ground rule given above since one channel
is active and controls the output. If the active channel fails, it must trans-
mit a signal which makes another channel active to control the output.
4.6 NONVOLATILE MEMORY TRADEOFF
Computer systems of the future will be designed to have very low heat dis-
sipation and power usage, and to meet high reliability specifications. For
these reasons core memories are being replaced by semiconductor
"RAM-ROM" memories, and with this change goes the system's capability
of non-volatile "scratchpad" memory.
Presented here are two possible ways to implement a small non-volatile
"scratchpad" memory for storage of maintenance information.
C-MOS RAM memories are relatively new in the semiconductor memory
market, and have one major advantage over all other semiconductor
memories: when not in a cycling mode (just holding data) they use practically
zero power supply current. Because of this characteristic, the C-MOS
memories can be connected in a circuit, where a battery supplies the
necessary power supply current after the normal power supply has been
shutdown, thereby causing the C-MOS RAM to maintain its storage. Since
the leakage current of the battery is larger than the current required by
the memory, the battery can supply power to the memory for practically
the shelf life of the battery. Rechargeable nickel-cadmium batteries can be
used with the battery being charged by the normal power supply during system
operation.
Advantages
• Simple low-cost implementation
• Easily interfaced with CPU (fast read and write times)
• Operates from 5V power supply
• Has reliability associated with older technology
• Available from many sources.
Disadvantages
• Nickel-cadmium batteries operate over a limited temperature range
(low current requirements of C-MOS will allow operation over a
range of up to -40° C to +65° C ambient).
• Batteries must be replaced after a specified time as a maintenance item.
• Memory bits can be lost if power is lost for any amount of time;
therefore, maintenance people must be careful, when transporting
the memory module, not to short or otherwise affect the battery
power to the memory.
MNOS is a new P-channel MOS technology for implementing electrically
alterable non-volatile memories (actually electrically alterable ROM's).
In these devices binary data is stored in an array of MNOS transistors.
The storage mechanism involves a change in threshold of the transistors
when charge is injected under the gate electrode by the application of a high
voltage (20 to 30 volts).
The technology is very new and therefore has all of the problems associated
with a new technology. As a result it cannot be seriously considered as a
viable solution to the problem at the present time; however, it looks very
promising and should be ready by the 1980's.
Advantages
• Data is solidly stored in the memory; maintenance personnel do not
have to be careful of the power supply when handling the memory.
Disadvantages
• Limited temperature range at present (-25° C to +70°C)
• +12V and -12V operating power supply
• -24V and +30V programming power supply
• Memories must periodically be replaced as a maintenance item
• Poor reliability associated with a new technology
• Single-source supply
• Very long write time
APPENDIX E
FAULT ANALYSIS
The viability of the ARCS reconfiguration process was demonstrated by simulating this
process. This appendix discusses three simulation runs that show the system's potential
ability to attain normal operation after power on and to continue operation after experi-
encing transient power faults.
Since power on recovery exercises the major portions of the software needed to recover
from a watchdog monitor trip, it was decided that a run with a watchdog monitor would not
add to the understanding of that process.
The sensor reconfiguration problem was deemed to be of such magnitude that to simulate
it would be far out of the scope of the ARCS study. The power on sequence simulation
did show that the interface between computer and SSFD recovery is suitably defined in
the ARCS Software Design.
The following paragraphs discuss the simulation results.
Triplex Power-On
Table E-1 shows the hardware indicators for this simulation run. The power for A, B,
and C comes on at 0.02, 0.1, and 0.2 respectively. Power-on interrupts are seen to
occur at these times and the iteration reset interrupts begin.
Table E-2 shows the synchronization status of the three computers. A is initially alone
as shown by the octal number 4. At 0.1 B comes on and the synchronization is duplex
with A and B. At 0.2 C comes on and each computer shows triplex synchronization.
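The octal synchronization status can be read as a three-bit mask, one bit per computer. Only the value 4 for computer A alone is stated in the text; the assignment of 2 to B and 1 to C is an assumption made here so that the duplex and triplex cases come out as 6 and 7, and the names below are hypothetical.

```python
# Assumed bit assignment: A = octal 4 (from the text), B = 2, C = 1.
CHANNEL_BITS = {"A": 0o4, "B": 0o2, "C": 0o1}


def decode_sync_status(status):
    """Return (mode, members) for an octal synchronization status word."""
    members = [ch for ch, bit in CHANNEL_BITS.items() if status & bit]
    mode = {1: "simplex", 2: "duplex", 3: "triplex"}.get(len(members), "none")
    return mode, members


for word in (0o4, 0o6, 0o7):
    print(oct(word), decode_sync_status(word))
```

Under this assumed encoding the power-on sequence described above would show the status moving from 4 to 6 to 7 as B and then C join.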
TABLE E-1 SIMULATED HARDWARE INDICATORS
(Columns: TIME; POWER A, B, C; WATCH-DOG A, B, C; INTERRUPT A, B, C; ITERATION RESET A, B, C. Entries are T or F at each 0.02 time step from 0 to 1.0.)
TABLE E-2 SYNCHRONIZATION INDICATORS
(Columns: TIME; synchronization status, as an octal number, for computers A, B, and C.)
Tables E-3, E-4, and E-5 show the recovery indicators for the A, B, and C computers.
At 0.02 A starts power-on recovery in simplex. At 0.1 B comes on. A sets the B recovery
request flag and immediately releases its B permanents since no other computer is
attempting recovery. At the same time B releases its own permanents and starts
recovery.
At 0.2 C comes on. Since B is still recovering, C's permanents are not released.
This can be seen in Tables E-6, E-7, and E-8. At 0.22 B's do-not-use is reset;
therefore C can begin recovery at 0.24. Until this time, C had run as if it were B.
Tables E-9, E-10, and E-11 show the input from the sensor, the selected signal, the
limited input to the filter, and the filter output. The point to note is that the recovery
does not introduce any errors in processing.
TABLE E-3 RECOVERY INDICATORS FOR COMPUTER A
(Rows at each 0.020 time step from 0.000 to 1.000.)
TABLE E-4 RECOVERY INDICATORS FOR COMPUTER B
(Rows at each 0.020 time step from 0.000 to 1.000.)
TABLE E-5 RECOVERY INDICATORS FOR COMPUTER C
(Rows at each 0.020 time step from 0.000 to 1.000.)
TABLE E-6 DO NOT USE, PERMANENT FLAGS AND COUNTERS FOR A, B, AND C SIGNALS FOR COMPUTER A
(Columns: TIME, followed by do-not-use flags, permanent flags, T1 counts, and T2 counts for the A, B, and C signals.)
TABLE E-7 DO NOT USE, PERMANENT FLAGS AND COUNTERS FOR A, B, AND C SIGNALS FOR COMPUTER B
(Columns: TIME, followed by do-not-use flags, permanent flags, T1 counts, and T2 counts for the A, B, and C signals.)
T
.
.
.
•
.
.
.
.
.
.
•
.
.
•
.
.
.
.
.
.
.
,
.
.
.
.
•
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
IME
v. f1 'J
i. c~\j
r.',.)
I. O'J
U60
iC(
120
1 -t j
160
.' HO
£. t.:U
22*
<:4 <j
z o y
Z 5 0
200
3C. f
34."l
3tC<
38*1
4ffC/
4 2U
^ 4 0
460
4 bO
500
52 r
i>4o
5 60
!?60
c;;.;
><<:i-
o40
c6(:
oa;-
?ro
720
740
7cC-
780
Ho
823
6 4 0
c6D
Pt'O
9 CO
420
940
960
980
C£0
4
i.A
•v
0
0
J
0
u
v;
3
0
r.
v/
•2
j
j
C
0
r
G
j
5
T
,,>
J
0
0
0
V,
\l
o
o
•j
0
•J
' ^
u
0
)
•J
0
T
J
1
0
J
0
J
J
3
0
0
0
^
0
B
•j
c.
0
<-
0
J
0
"J
o
*•:
1
J
0}
-•}
0
0
n
o
j
?
0
<•',
J
J
J
tlJ
•J
•j
0
'.)
0
0
a
,;
D
3
,^
l-
o
•j
•j
j
rj
o
•J
0
0
0
9
n
c;i,
0
0
.,,
0
t K
./
0
v/
u
,5
v
C'
1
1
I
1
1
••1
0
0
^
0
;j
0
V
0
•^
u
'w
0
T
0
0
t.
i_tf
0
0
V?
U
(f
0
3
0
0
0
c»
0
V
0
0
*J
A
.^
c
0
v!
c
i
0
r\.
0
i»
0
\J
0
I/
u
0
•)
L-
0
0
.—
0
u
0
I,
0
c
fV
^0
(•
w
0
0
.')
f
0
0
G
c
0
0
c
0
0
c
c,
0
0
c
0
D
f
B
t-
0
0
^0
0
0
1
0
V
0
r
0
0
0
0
w
c
j
0
0
0
0
0
V-
0
o
0
c
G
f
0
0
0
0
0
0
0
c
0
0
c
0
0
0
u
0
0
0
0
0
c;
0
u
c
0
'J
;>
0
o
J
O1
1
1
c
0
0
0
;„'
c-
0
0
^
u
r.
0
c
0
n
o
0
0
V."
J
0
0
(I
0
c
u
0
c
0
c
0
f!
0
(.
0
0
0
0
3
A
'j
j
0
C
0
a
G
o
o
0
3
0
0
G
0
0
0
o
c
0
0
0
0
0
0
o
•J
0
0
0
0
0
.-*
.0
0
0
0
J
0
0
0
0
0
0
0
0
0
0
T1 COUNT
~B
0
0
0
p
0
0
0
0
0
6
0
0
0
0
0
a
u
0
0
J
0
0
0
0
0
3
0
0
c
0
0
a
0
u
0
J
0
D
0
0
0
D
0
0
G
0
0
T2 COUNT
c
0
u
c
c
0
0
0
c
c
i
5
6
•J
A
0
0
c
c
0
i,
0
G
0
0
0
c
0
c
r
0
0
c
0
0
A
0
c
0
c-
0
c
0
c
0
0
u
0
0
0
0
0
0
0
0
0
a
0
0
0
0
0
0
c
c
0
0
0
0
0
0
0
0
0
c
0
c
u
0
a
0
0
0
c
0
0
u
0
t
0
0
0
0
6
c
0
0
0
J
0
0
0
V
0
0
u
0
D
D
0
0
c
0
0
0
0
0
0
D
u
0
G
0
0
0
0
0
0
0
3
c
0
0
0
0
0
0
0
1
2
3
5
6
0
0
0
n
0
0
0
*l
0
0
0
0
0
0
a
0
o
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TABLE:E-8 DO NOT USE, PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER C
TIME
• t C v>
• Oi.o
• w H w
. (..-oG
. ?v a !"'
.i^r
. 2 2 0
. j. 4 0
.IoJ
.J.bft
,2vu
.22:0
.240
.26."*
• Z 1 0
.3' 0
, 3 ^C
.b 40
* 3eO
. 3 e«.»
.4C-v
.WJ
. i 4 0
.46'1
.4 tJ
.500
• i t >.
. 5 'i 0
t- . r*
• - *J \'
.580
.(;C.-v
. t2y
.6^-0
• fioi,;
. fciio
.7PO
.720
.7^^
. 7 6 C-
.7cH
.81 u
.bZi
.8^.0
• bfcO
. 65C
.900
.WJ
.9^0
.S6U
.98V
l .CGO "
SENSOR A
.'.'O^A
.7«i3l
. 5 9 ^ 5
. - jc ' /b
-. >i,79
-.566-0
-.^511
- . tCo?
.^067
.9^11
.6660
.?v.79
-. j57d
-. '39t5
-.7^.31
-. joOC
.7^.31
.9945
.i>S76
-.2U79
-.866^
-.9i21
- . 4 0 o 7
.4067
.4511
,H.6oO
.2^79
- .5b7b
-.9945
-.7431
- .OCoC
.7431
. 9945
. b 1 7 o
- .2079
-.3660
-.9511
-.4067
.4C67
.•yf^i .
,8t6j
.2079
-.3878
-.9943
-.7431
-.OCUO
. 7431
.9945
.2b7fl
- .2C79
-.3663
SELECTED
SIGNAL
.CCCO
.7431
.9945
. 5 8 7 6
- .2C79
-. o66f»
-.95,-il
- .4067
.4067
. 9 5 i I
.8660
.2179
- .5678
-.9945
-.7431
-. OtoO
.7431
.9945
.5873
-.2079
• - .?66C
-.9:^11
-.4C67
.4Cr.7
• 9Mx
. Ofc60 .
.2C79
- .5b78
-,9Si5
-.7431
-.0000
. 7431
.9945
.5176
- . 20 /9
-.866P
-.9&11
-.4067
. .4C67
.9511
, 8cc^
.2C79
- .5878
-.9945
-.7431
-.0000
.7431
.9945
.5573
-.2G79
-.866^
INPUT
TO FILTER
.ecoij
.7431
. 9 ^ 4 5
. '.'; o 7 5
-.2079
-.6660
-.9t>il
-.4L-67
.4067
.9511
. 060 0
.2079
-.5876
-.v94j
-.7431
-.0000
.7431
.99-45
.3d? a
-.207*
-.8660
-.9511
-,40t>7
.4067
.9511
.6660
.2079
- .587a
-.9745
-.7431
-.0000
.7431
.9945
.5o76
-.2079
-. 6660
-.9511
-.4067
.4067
.V511
. 666u
.2079
-. 587 b
-.9943
-.7431
-.0000
.7431
.9945
.587b
-.2079
— . 0660
OUTPUT
FILTER
.3331
.2705
. .8303
.9355
. J 0 / 'y
-.5115
-.9736
-.799*
-.09o9
.67il
.9948
.6599
-.1115
-.d091
-.9713
-.4906
.3146
.9117
.9056
.3001
-.5039
-.9745
-.6002
-.0964.
.6712
.9946
.6599
-.1115
-.6091
-.9713"
-.4906
.3146
.9117
.9)56
.3001
- .5339
-.5745
-.8002
-.0964
.6712
.9945
.6599
-.1115
-.8091
-.9713
-.4903
.3146
.9117
.9056
.3001
- .5039
OF
TABLE:E-9 INPUT AND OUTPUT FOR COMPUTER A
SELECTED INPUT TO OUTPUT OF
:TIME
' »•&&':>
.f2;.>
.(, 4C
.loJ
.(' 3C
.100
. I 2 v
.lit
. lof
.IcO
,2<* >
. i_ 2; u
.civ
. i C- s't
t ?. £ 0
.j&O
.3>0
. ?. ^  1
.2 £:0
• i rj (.
.10U
. 1 e. 0
,1sG
.16C
. 1 b c
. L>!. "
. ! ZO
.510
. !:- o '"'
.560
.6 CO
. o 2 u
.610
. 1 6 C
• t <5 v*
. 7C 0
. 7 z •:.»
. 7n 0
.760
.'i e-0
. t £ J
.620
. clO
. boO
. c g o
. 9 r» v
.920
. 91 J
.V60
.960
1 .C01
SENSOR B:
.0000
. "iC-00
• ujOOf1
• L-t. uo
.JUOO
-. 3o60
-.9511
-.1G67
.1067
.V511
. B 6 6 1
,2^79
-.:•>&? 6
-.9915
-.7131
-.0000
, / 1 3 1
.9915
. 5b7o
-.2079
- .6^60
— .Oh"! '
-.'1067
.1067
.9»li
.S66C
.2079
- . i; 6 7 6
-.9915
-.7131
-.IVJOO
.7131
.9915
. 'Jb7B
- .2079
- .0 66C
-.9511
-. 1i o 7
.106 7
.9511
. SboO
.2P79
-.5678
-.9915
-.7131
-.ycuo-
.7131
.9915
.5676
-.2079
' -.3660
SIGNAL
.ocoo
. OOGO
.;>CfO
.DDbT
. O C O O
-.6660
-.95-11
-.1067
.H67
.91?11
.6660
.2C79
-,5b78
-.9915
-.7131
-.OCCO
.7131
.9915
.5675
-.2L79
- . S f c ^ C
•- . 9 £ U
-,10t7
.1067
.9512
.6660
.2C79
- . 2 b 7 6
-.991;
-.7131
- . 0 C 'j J
.7131
.9915
.5 b? o
- .2C79
-. 8660
-.9£il
-.1067
.1067
.9511
.8660
.2C79
-.5 fa 73
-.9915
-.7131
-.0000
.7131
.9915
.5673
-.2079
- . 6 6 60
FILTER
,(?OV,C
.OuOO
• OOOc
. 0000
. SOwO
-.8660
-.95il
-.t067
.1G67
.9511
. bb6-j
.20.79
- .5573
-.9915
-.7131
— . 0 0 0 \j
.7131
.9915
. 587o
-.2079
— . 6 6 6 0
-.9511
-,1-w6/
.1067
.9511
. 6560
.2079
— . 5 5 7 8
-.9915
-.7131
-. cooo
.7131
.9915
. ipo78
- . 2 0 7 9
-.8660
-.9511
-. 106 /
.1067
.9511
. 6660
.2079
-.5678
-.9913
-.7131
-.0000
.7131
.9915
. .5678
-.2079
-.5660
FILTER
.0000
.OJOJ
.0503
.0000
.OJOf!
-.5115
-.9738
-.7v9D
-.Q9o9
.o7il
.9918
.6599
-.1115
-.8091
-.9713
-.1906
.3116
.9117
.9J56
.3001
- .5039
-.9715
-.60',) 2
-.0961
.6712
,991?j
.6599
-.ills
-.o.)91
-.9713
-.190o
.3116
.9117
.9J)3fa
.3001
-.5339
-.9715
-.6002
-.0961
.6712
.9916
.6599
-.1115
-.8091
-.9713
-.190b
.31^3
.9117
.9056
.3001
-.5039
TABLE:E-10 INPUT AND OUTPUT FOR COMPUTER B
.TIME
S-iHrtr-
.?2J
.343
.••'t..:t •
• ('• b ^
• .100
.12 0
• 14C
. I C'.i
. ] 60
• 2i'i}
. c 2. 0
.2tP
. 2o()
.260
. 3(-0
. 2 £U
.340
.360
.?b£
.400
• 4 2 0
• < r 4 C
. 463
.460
.500
.52x1
,;-40
.^63
.560
.600
. *• > A
• v.' C. •-,-
.640
.660
.66U
.760
.720
» 74v
.760
,7cr.
. P O O
.1320
. t40
.860
.88U
.9 CD
.9<^j
•"'.^'0" ~
.960
.560
"I70T5T"
SENSOR C
. .^c, os.
.ocf.o
.0000
. ..iV-'JO
.JG^
. 0000
. O C t ? 0
..^0.00
..'^ '"-^
.,•000
.d66H
.^a?9
. - . 5678
-.99^*5
• -,7<i31
-.oouu
. 7 <» 3 1
.••?9^5
. 3876
- .2079
- . 8 6 6 C
-.3511
- . <t 0 6 7
. ^ t067
. 9i>l. I
. 6 c 1 0
.2079
- . -j b I b
- .99<t5
-.7V31
-. ^OOD
.7^.31
.9945
.3676
- .2079
' "-.o6fc&"
-.9511
-.*C67
.40o7
,951i
.3660
.2u79:
- ,587b
-.9945
"~-~.7<r3i"~
-.0000
.7431
~ " .91*45
• 3 c; 7 b
-.2079
-. 86"6(>~"
SELECTED
SIGNAL
.0000
.00(0
.00 CO
. otoc
• OGOO
.0000
.0000
.0000
.oc-oo
. or cr
.8660
.2079
- . b 6 7 6
- .9945
-.7431
-.0000
.7431
.9941?
.5? 78
-.2079
- . 3 6 on
-.9^11
- - .4067
.4067
. 9 i> i 1
.Bt fcO
.2C79
-.i67?.
- . 9 9 4 i
-.7431
-.0000
.7431
" .9945
. 5 6 7 8
-.2079
'-.eocTO
'-.9511
-.4067
.4067
.9513
.6660
~ .2(;79
- .5673
-.9945
~--.7\31 "
-.OOuO
.7421
'" ' .9945 "•"
.5878
-.2079
-.8't'bD""" ~
INPUT TO
FILTER
. O O O w
. CwL - o
.0000
.0000
.OCOw
.0 JOG
. 000-w
.0000
.0000
• j 0 0 j
. 6660
.2079
-.3876
-.9943
-.7431
-. ywCu
.7431
.9945
.5878
-. 2079
-.666;)
-. 9511
-,4Uo7
.4067
.9311
• 8660
.20 79
- .5878
- .9943
-.7431
-.0000
.7431
. 9 9 4 5 '
. 5 8 7 8
-.2079
— . 8 66 >/
-.9511
- . 4 w 6 1
.4067
.9511
.6660
.2079
-.5873
-.9943
-.7431
-.0000
.7431
r994:s
.5(376
-.2079
"- • 8"6~b~£.r
OUTPUT OF
FILTER
. 0000
.0000'
.3300
.0000
. .0000
.0000
• COOP
.00^0
.0000
.0300
.9948
.6599
-.1115
-•8<m
-.9713
-.4906
.3146
.9117
• 9tf56
.3001
- .3039
- .9745
-,<3u02
-.0964
.6712
.994.6
. 6599
-.1115
-.8D91
-.9713
-.4906
.3146
.9117
.9056
.3001
-.5039
-.9745
-.a;) 02
-.0964
.c-712
.9946
.6599
-.1115
-.6091
-.9713
-.4908
.3146
.9117
.9356
.3001
~~-,5GH
TABLE:E-11 INPUT AND OUTPUT FOR COMPUTER C
Single Power Fault
Table E-12 shows that the A computer lost power from 0.40 to 0.58 seconds.
At 0.60 power to A comes back on and recovery proceeds.
Table E-13 indicates that, as expected, the A computer does not synchronize
while its power is off and the system goes duplex with B and C.
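
The indicator values printed in Table E-13 (0, 4, 6, 7, and 3) are consistent with a 3-bit mask carrying one bit per computer. The sketch below decodes a value under that reading. The bit assignment (A = 4, B = 2, C = 1) is inferred from the tabulated values, and the function names are hypothetical; neither is stated in the report.

```python
# Assumed bit assignment, inferred from the values in Table E-13.
SYNC_BITS = {"A": 4, "B": 2, "C": 1}


def decode_sync(indicator):
    """Return, in A, B, C order, the computers whose bit is set."""
    return [name for name, bit in SYNC_BITS.items() if indicator & bit]


def redundancy_level(indicator):
    """Name the operating mode implied by the number of computers in sync."""
    names = {0: "none", 1: "simplex", 2: "duplex", 3: "triplex"}
    return names[len(decode_sync(indicator))]
```

Under this reading, the value 3 that B and C report during A's outage decodes to B and C only (duplex), and the value 7 reported after 0.60 decodes to all three computers (triplex).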
Tables E-14 and E-15 indicate that at 0.60 the A computer needs recovery, based on the
fact that it has just synchronized with the others. Table E-16 indicates that A requires
power-on recovery at 0.60.
Tables E-17, E-18, and E-19 show that at 0.60 A's permanent flag is released and its
do-not-use flag is set. After a time T1, A is again accepted as good and the system
is again triplex.
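
The flag sequence described above can be sketched as a small state machine. This is a minimal illustration, not the report's flag logic: the class and method names are hypothetical, the six-iteration length of T1 is read off the counter columns of Tables E-17 through E-19 (which run 1 through 6), and the T2 counter is not modeled.

```python
T1_ITERATIONS = 6  # assumed: the T1 counter columns run 1..6 before clearing


class SignalStatus:
    """One computer's view of one signal source (hypothetical names)."""

    def __init__(self):
        self.permanent = True    # source held failed until released
        self.do_not_use = False
        self.t1_count = 0

    def release_for_recovery(self):
        """Clear the permanent flag and start the do-not-use period."""
        self.permanent = False
        self.do_not_use = True
        self.t1_count = 0

    def step(self):
        """Advance one iteration; accept the source once T1 has elapsed."""
        if not self.do_not_use:
            return
        if self.t1_count >= T1_ITERATIONS:
            self.do_not_use = False
            self.t1_count = 0
        else:
            self.t1_count += 1

    @property
    def usable(self):
        return not (self.permanent or self.do_not_use)
```

Calling `release_for_recovery()` at 0.60 and then `step()` once per iteration keeps the do-not-use flag set while the counter runs 1 through 6 (0.60 through 0.70) and clears both at 0.72, which is the pattern the tables show for A.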
Tables E-20, E-21, and E-22 show the inputs and outputs. Again note that the recovery
does not produce any problems.
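
The SENSOR A column of Table E-20 matches a unit-amplitude sine wave with a 1-second period, sampled every 0.02 second and read as zero while A has no power. The sketch below reproduces that column under this assumption; the sine form is inferred from the tabulated values, the function name is hypothetical, and the filter itself is not modeled.

```python
import math

DT = 0.02  # iteration period in seconds, from the TIME column


def sensor_a(k, outage=(20, 29)):
    """Sensor A sample at iteration k; zero while A's power is off.

    The default outage bounds are the iteration indices for
    0.40 s through 0.58 s, the power loss shown in Table E-12.
    """
    if outage[0] <= k <= outage[1]:
        return 0.0
    return math.sin(2.0 * math.pi * k * DT)
```

For example, `sensor_a(1)` rounds to .1253 and `sensor_a(30)` rounds to -.5878, the values tabulated at 0.02 and 0.60.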
POWER
POWER
TIME
-.0000
.0200
.0400
.0600
-.oefto
. 1.000
.1200
.1400
.160.J
.1.800
.2<VH>
.?200
.?<»•).:»
.2600
.2800
.3000
.3200
.3^ 00
.3600
,?,80-'»
.4000
.4200
.4400
.4H-UI'
. 4 J? 0 0
.5000 -
.5200
.5400
.5600
.5800
.6000
.6200
.*4GO
.6600
.6800
.7000
.7200
.7<,00
.7600
.7POO
.8000
.82UO
.8400
.8600
. H800
. 9000
.9200
.9400
.Q600
.9800
1.0000
.A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
P
F
F
P;
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
t
B
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
.T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T'
T
T
T
T
T . •
T
T
T
T
C
P
F
F
F
F
F
P
P
F
F
T
T
T
r
T
T
r
T
1
T
T
T
T
T
T •
T
T .
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
WATCH-DOG
A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
F
F
F
F
• F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
B
,F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
t
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
C
F
F
F
F
F
F
F
F
F
F .
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
INTERRUPT
A
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
B
F
F
F
F
F
T
F
F
F
' F
F
F
F
F
F
' F
F
F
F
F
.F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
C
F
C
F
F
F
F
F
F
F
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
ITERATION
RESET
.A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
• F
F
F
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
f
T
T
""t
T
T
' " T "
B
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
C
F
F
F
F
F
F
F
F
F
F.
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
t
TABLE:E-12 SIMULATED HARDWARE INDICATORS
TIME
.-ooo
,G?.u
.040
.060
.080
.If;*)
.12u •
.140
.160
.160
.200
.220
. 2 <• U
.260
.280
.300
.32')
. 1 4 0
.260
.3*0
.400
.420
.440
.4^0
. 4 6 ,>
. •500
.520
.^40
.^60
. •?«&
.600
,6?0
.640
.t)6C
.680
.700
.720
.740
.760
.780
.800
.820
.8*0
.R60
.880
.900
.920
.940
.PbO
.980
1.000
A
o •
<.
^4
<V
6
6
IS
t>
6
7
7
7
7
7
7
7
7
7
7
o
0
;')
0
/\
o
0
0
0
0
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
B
0
0
0
0
0
h
fj
(S
6
ft
7
7
7
7
7
7
7
7
7
7
.3
3
3
3
3
3
3
3
3
3
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
C
"0
0
0
0
0
0
0
o
0
0
7
7
7
7
7
7
7
7
7
7
3
3
3
3
3
- 3
3
3
3
3
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
" 7
TABLE:E-13 SYNCHRONIZATION INDICATORS
TIME
•.t'vtr
.020
. 0 4 0
.060
. 090
. i^>(J
.120
.1*0
.IbO
.IbO •
.200
,220
. 2 *» 0
.260
.280
.300
.320
.3*0
.360
.360
.^00
.420
.4.4.O
. '-tbO
. ^ <Q > J
.500
.5?0
.5*0
.560
. 5 80
.600
.620
. 6 4 Q
.660
. 6 ^  0
• 70'"'
.720
.7^0
.760
.780
.800
.620
.840
.860
.R80
.900
.920
."9*0
.960
.980
1.000
-1
1
o
f*l
0
t.
o
,J
0
"j
3
"i
r>
0
o"
o
(I
:i
f!
0
-1
-1
-•J
— 1
-1
~ 1
-1
-1
_ 1
-1
1
r>
0
n
0
0
0
0
0
i;
o
0(•)
o
0
0
0
0
0
n
0
A
o
o
0
0
0
o
0
o
0
;)
0
0
0
o
0
0
0
0
0
0
o
0
i)
0
;i
0
0
0
,'J
0
(.)
6
0
'»
0
n
0
ij
0
T *
(>
v>
0
0
0
0
0
0
0
n
0
B
0
0
o
t s
V
o
fl
0
o(.
0
0
l'l
0
0
0
0
f\
0
o
0
0
0
c.
0
C'
0
0
0
0
0
0
0
0
0
0
;>
0
o
0
0
0.
0
0
0
0
0
0
0
o
!J
o
c
"o
o
0
0
0
0
0
0
0
fl
1
1
0
()
0
0
0
0
0
0
o
0
0
0
0
0
o
f\
•<j
0
0
0
0
0
0
0
0
0
0
o
0
0
0
0
0
0
0
0
0
o
0
TABLE:E-14 RECOVERY INDICATORS FOR COMPUTER A
T I M E '
:
 .OO'O
.020
.0*0
.060
, u f ? 0
. lU'J
. 1 2 f
.1*0
. 1.60
. I P O
.2 On
. 220
.2*0
.260
. 2 H D
.300
.320
. 3 '< 0
.360
.3H 0
.*00
.*2U
. < t < - 0
.**0
. * H 0
.500
.520
.5*0
.560
.?80
.600
.620
.6*0
.660
.680
.700
.72?
.7*0
.760
.780
. R O O
.820
.8*0
.860
.880
.900
.920
.9*0
.960
.980
1.000
x-
-1
-i
-\
-1
-I
1
('!
0
o
0
*
0
0
>'!
fi
0
o
0
0
n
0
o
0
0
n
0 ,
u
0
n
0
3
0
0
0
0
'•]
li
0
o
o
0
0
f.l
o
0
0
0
a
0
o
o :
.A
y"
0
•i
;.i
o
0
0
0
•V
0
0
0
0
,">
0
0
n
0
0 •
A
0
0
0
f\
0
G
0
0
0
0
0
0
0
0
0
'0
\)
0
o
0
0
5
u
0
o
0
0
0
0
0
0
B
0
0
0
0
0
0
0
0
0
0
0
0
c
0
0
0
0
o
0
o
0
0
0
0'
o
0
0
0
0
0
c
0
0
0
0
0
0
0
6
0
0
0
0
0
0
0
0
0.
0
0
0
c
o
0
0
0
0
0
0
0
0
0
I
I
0
f\
0
0
0
f.l
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
o
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TABLE:E-15 RECOVERY INDICATORS FOR COMPUTER B
TIME
-. f-K'rO'
.020
.0*0
.060-
.080
.100
. 1. 2 3
.1*0
.160
.180
.200
.220
.2*0
.260
.260
.300
.320
.3*0
.360
.380
.400
' .420
. 440
.460
. 48y
.500
.520
.5*0
.560 ,
' ' . 580" "
.600
.620
.6iO
.660
.680
.700"
.720
.7*0
.760
.780
.800
" .820 ~
.b*C
. 8 6 0_ _
.880
.900
.920
.940
.960
.980
1 •"'VOO'
^- '
 L
-i
-i
-i
-i
-i
-i
-i
-i
-3
-1
i
'"*
0
y
0
0
0
i.)
0
0
0
0
0
0
0
0
0
0
o
0
4
f)
0
0
0
" "fj
0
0
o
0
0
__
0
0'
0
0
0
o
0
00 " " '
A
o" .'
0
o
0
0
o "
0
0
0
o
0
0
0
0
0
o
o
0 '
0
0
0
0
0
0
u
o
0
0
0
6 "
0
0
0
0
0
0
0
0
0
0
fl
o~~
0
•o
6 "
0
0
o
0
00 ""
B
0
0
0
0
0
?1
o
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
.0
0
0
0
0
o
0
o "
0
0
0
0
o
0
o
0
0
0
0
" 0
0
0
0
0
0
o
0
0
0
c
0
0
0
0
0
o
0
0
0
o
1
1
0
0
0
0
0
0
0
o
0
0
0
0
o
0
0
0
0
"b
0
o
0
0
0
"6"
0
0
0
0
0
0
0
0
0
0
0
c
0
o
0
TABLE:E-16 RECOVERY INDICATORS FOR COMPUTER C
T1 COUNT T2 COUNT
. T I M E 1
. u *,><'-
.020
.040
.060
.080
.100
.120
.140
• I6u
.180
.200
. 2?0
.240
.260
.280
.300
.320
.340
.360
.380
.400
.42.0
.440
.460
.480
* . ' V 'j1
.520
.540
.560
.560
.60-)
.620
.640
.660
.680
.700"
. 7 ?.. :'}
.740
.760
.780
.600
.820
.84j
.860
. 880
.900
.920
.940
.960
.980
1.000
A
n
o
0
0
0
0
0
0
0
0
0
o
0
•J
0
0
0
0
0
o
0
0
0
n
(\
)
0
0
o
0
1
1
1
1
1
1
o
0
0
0
A
0
0
0
iji
0
0
ft
0
0
0
B
0
0
o
0
0
1
1
.1
i
1
1
0
0
f'l
0
0
0
0
0
0
0
0
0
0
0
fj
0
0
0
0
0
f j
0
0
u
6
0
0
o
0
0
0
6
0
0
0
0
~0
/•»
0
'0
c
"6
o
0
0
0
0
0
0
0
0
0
0
1
.1
I
•1J.
1
\
G
!)
0
Q
0
o
0
0
0
0
0
0
0
\J
0
0
0
0
ij
0
0
0
0
0
.0
0
0
0
0
0
0
0
0
A
0
0
0
0
0
o
t!
0
0
0
0
0
0
tj
0
0
0
f.l
0
0
0
o
0
0
0
0-
0
0
0
0
o
0
0
0
Ci
0
0
0
fj
0
0
0
o
0
0
0
0
"0
0
0
0
B
0
1
1
.1
1
0
c
0
0
0
o
0
0
0
0
0
0
0
0
o
0
o
0
u
0
;)
o
0
0
0
0
J
0
0
0
'0
o
0
0
0
b
0
0
0
0
0
0
0
0
0
0
cdj
1
1
1
1
1
1
1
1
1
1
0
o
0
0
0
0
o •
0
0
o
0
n
0
0
0
0
0
0
a
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0"
0
0
0
A -
b
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
u
0
0
0
0
0
0
1
2
3
4
t)
b
0
0
0
0
0
0
0
0
0
0
0
" .0
0
0
"0
B
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
,;
0
0
0
0
0
0
r»
0
0
0
*J
0
0
0
0
0
y
0
J
0
0
0
0
0
0
0
0
0
0
o
0
0
0
c
0
0
0
0
0
0
0
0
.0
0
0
0
1
2
3
<t
5
b
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
A.
0
0
0
0
0
0
0
0
0
0
0
0
0
0
o
0
0
0
0
0
'0
0
0
0
a
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
"0
0
0
0
0
0
0
B
0 "
0
D
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
o
0
0
0
•J
0
0
. 0
0
0
0
0
0
0
0
D
J
0
0
0
0
0
0
0
0
0
0
0
0
0
0
c.
0
0
D
0
0
0
0
0
0
0
0
0
1
2
3
4
3
6
0
G
0
D
0
0
0
0
c
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TABLE:E-17 DO NOT USE, PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER A
<? ,<r
. TIME A B C A B C
.0 <">•,) •') 0 C U '.' 0
.^20 0 0 0 0 0 CJ
.040 0 0 0 0- 0 0
.160 O O O O O O
• O B O 0 0 0 0 0 0
.100 0 1 ij 0 0 1
. 1<~'J ft 1 0 o 0 1
.140 0 1 0 0 0 1
.160 0 1 0 C ,0 1
.180 0 1 0 0 0 1
.20-) 0 1 (• 0 0 1
.220 0 •» 0 v "I 1
.240 ') -J 1 0 0 >
.260 0 0 1 0 0 0
. 2 R O y 0 1 00 0
.300 0 0 1 0 0 0
.320 0 0 1 0 0 0
.340 -j •:> i o u 0
.360 -j 0 :• n o ;?
. 3BO 0 0 0 0 0 0
.400 0 001 0 0
. 4 ?_ 'j 0 0 u 1 0 ^
. 4 4 Q 0 0 0 1 0 0
.460 J 0 0 1 U 0,
,4v;.;o 0 0 0 1 0 0
. 5 U ;.i 0 0 0 1 0 0
. 520 0 0 0 1 0 0
.•HO o j o i o n
. 5 fcO 0 0 0 1 0 0
.580 0 0 u 1 0 0
.600 1 0 0 0 0 0
• ^20 .). ;> o o o o
.640 1 0 0 0 0 0
.6*0 1 0 0 0 0 0
.680 100 0 U 0
.700 1 0 0 0 y 0
.720 0 0 0 0 0 0
.740 0 0 0 0 0 0
.760 0 i) 0 0-0 J
.760 O O O O O O
.ROO f» o o o a o
.820 O O O O O O
. *4'J U U 0 0 0 0
.660 O O O O O O
.880 0 0 U 0 0 U
.900 O O O O O O
.920 " J 0 0 0 0 0
.940 0 0 0" 0" 0 0
.960 0 0 0 0 0 0
. 9RQ y o o o o o
i.ooo o o o o o o
T1
A. -
0
n
0
o
0
0
0
o
0
0
0
0
o
0
0
0
0
(,
c\
A
V
0
0
0
0
0
0
0
0
0
0
1
2
3
4
5
6
j
n
f.
0
0 .
0
0
0
0
0
0
0
0
0
0
COUNT
B
0
o
0
0
o
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
- o
o
0
0
0
0
a
0j
o
0
0
0
0
0
J)
0
0
0
o
0
0
0
0
0
0
'.)
0
0
o
C
0
0
0
0
0
0
'J
0
•0
0
0
0
1
2
3
<t
5
f,
0
0
0
0
0
0
0
0
0
u
0
0
0
0
0
0
0
0
0
0
u
0
0
0
0
0
0
0
0
0
0
0
0
T2 COUNT
A
0
0
o
0.
0
0
0
0
0
0
0
1,1
u
0
0
o
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
\J
0
0
0
0
0
0
0
0
B
a
0
0
0
0
1
2
3
4
5
6
0
a
;>
0
0
0
,J
' 0
0
0
o
;>
1
0
0
0
0
0
0
0
o
0
0
•)
0
0
0
0
0
0
0
0
0
0
5
0
0
0
0
0 .
C
C'
0
0
0
0
0
0
u
0
0
0
0
1
2
3
4.
5
6
0
0
0
0
0
0
0
0
0
0
o
u
0
0
0
0
0
0
0
0
C
0
0
0
0
0
0
0
u
0
0
0
0
TABLE:E-18 DO NOT USE, PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER B
£ /£ / • • • • • •
«§> *
TIME
.000
.020
.040
.060
.«)«0
.K'O
. 1 2 U
.140
.160
.1*0
.20-}
,2?0
.34-L>
.?60
.280
.300
.320
. 3 4 0
. 3 6 U
.360
,4'TO
.4.?;)
.4^,0
.4.60
.4 BO
.500
.520
.540
.5f 0
. s 8 'J
.600
.620
.640
.660
.680
.70J
.7^0
.740
.760
.780
.800
.820
. H 4 0
.860
.880
.900
.920
.940
.Q«50
.980
1 . 000
A
i.i
0
0
0
,)
.)
0
0
i;
0
0
0
u
0
;i
0
0
r»
/..
0
0
'}
-,)
0
0
o
0
0
0
0
1
1
1
'L
1
1
o
0
'1
•)
0
0
0
0
o
0
0
0
o
0
0
B
0
0
o
o
•J
.)
0
0
0
0
1
0
0
0
0
0
0
o
o
0
n
V
:)
U
0
0
u
0
0
0
J
0
0
0
0
0
•J
0
0
3
0
0
0
0
0
0
0
0
0
0
0
0
^^^
c'
0
u
0
0
0
V
0
o
)
0
•J
0
i
1
i
1
1
i
'•j
0
0
0
/"*
> >
f\\J
0
'•'.•
0
0
0(',
0
i;
0
**•
V
0
0
0
0
0
0
0
0
0
0
0
0
0
o
0
0
n
•^^ M
A
0
•')
0
0
y
,'!
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
i
I
I
I
^
1
1
1
1
0
u
0
o
0
0
o
0
')
0
0
0
o
0
0
0
0
0
0
0
0
B
0
0
0
0
fl
•J
o
0
0
o.
0
0
o
0
0
0
0
0
0
0
0
u
0
0
u
o
0
{1
0
0
0
0
0
(1
0
;•
0
0
0
•)
0
0
0
u
0
0
0
0
0
0
0
•^v
c
0
0
0
0
o
J
0
0
0
0
1
1
0
0
0
o
0
0
0
0
0
v'
u
0
0
ij
0
0
0
0
0
0
0
0
0
0
0
0
0
t)
0
0
0
0
0
0
0
0
0
0
0
T1 COUNT
A
u
0
0
0
0
0
0
0
0
0
0
0
0
0
0
o
0
0
0
0
0
(;
0
0
0
- u
0
•0
0
(.1
1
2
3
^
5
.6
0
0
r*\s
0
0
0
0
0
0
0
0
0
0
0
0
B
0
0
I)
0
0
0
o
0
>j
0
b
0
0
0
0
0 -
0
0
0
0
0
0
0
0
0
0
0
o
0
o
0
0
0
0
0
0
0
o
0
0
0
0
0
0
0
0
0
o
0
0
0
c
0
0
u
0
0
0
0
0
0
0
0
0
1
a
3
*t
5
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
u
0
0
0
0
0
o
0
0
T2 COUNT
A
0
0
0
0
0
0
0
0
0.
0
u
0
0
0
0
0
0
0
0
0
0
0
0
t
0
0
0
0
0
0
1
2
3
4.
5
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
B
o'
0
'J
0
0
0)
0
D
5
6
0
0
>
0
0
0
0
0
? •.,.,
0
J
0
0
0
J
C'
J
0
a
0(.)
0
0
0
0
0
0
0
0
0
0
0.
0
0
0)
y
0
0
0
c
0
0
o
0
0
0
0
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
TABLE:E-19 DO NOT USE, PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER C
TIME
.•Ki-.j
.0?0
.'l 40
.060
. Ci 8 '»
.100
. 1 2 ?
.!<••.'
.It,;
. 1 ? J
,?0''>
. ? '. ''!
. ? <• o
. £' :'
. ?«.;
. 3 '.' u
.3?U
.34'!
.3' •»
.3?0
i .^ 00
:
 .42 O
. *4-i
;
 . 4 ft :")
.4*,,
.'-00
• ?>?'>
.540
. 56«'»
.^P;»
• 6-'.<u
.620
. 6 4 0
.66;.J
.680
.700
.720-
.7*ii
.760
. 7 8 •'.>
.eon
.M2.n
. fl ^ 0
- .86-^
.^ftsl
.900
.920
.M40
.960
.9 BO
l.n-j j
SENSOR A
.-;•'.:•').:'»
.12s! 3
.248?
.3*Si -
.4«1H
.5878
.6 H^'i
. 7 70 5
. 84*3
. 9 0* o
. .0*51. I
.
c
, y?3
,^gp(j
.99^ .')
.9^23
.^51!
. Q()48
.84*3
. 7 70 5
.t-H*S
.0:)00'
.0000
..'••
r
'00
.0000
.-vy-D
.0000
.0000
.0000
.O'JUO
.0000
-. 5H7^
-.6845
-.7705
-.P443
-.9048
-.9 -I1. 1
-.9823
-.99Hy
-.9980
-.9823
-.951V
-.9. '.'1*8
-.8443
-.7705
-.6846
-.5«7R
-.4818
-.3681
-.2*R7
-.1253
.0000
SELECTED
SIGNAL
• O^ou
.1253
.2487
.3o8l
.4818
.^^78
.6«45
• 77J*.
. .«443
.90*8
.9511
.9823
. 9 y 8 u
,99«0
.9823
.9511
.9048
.8443
.7705
.6 "45
.OOOO
.0000
.OO..H'
.0000
.(.-;00
.0000
.0000
.0000
.0000
.0000
- . S 8 7 6
-.6845
-.7705
-.8^43
-.9048
-.9511
-.9823
-.9980
-.99BO
-.9823
-.9511
-,9».-.*b
-.8443
-.7705
-.6845
-.5R78
-.4818
-.3681
-.2487
-.1253
.0000
INPUT TO
FILTER
•,oc> yo
.1253
.2*87
.3681
.4818
.5878
.6H45
.7705
.8443
.9046
.9511
.9823
.9980
.998i>
.9823
.^511
.9U48
.8443
.77.15
.6845
.'jt>00
.OO'JO
.'OuuO
• .0000
. v>0 ')w
.0000
.0000
.0000
.0000
.Oo JO
-.587H
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9980
-.98*3
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3631
-.2437
-.1253
.0000
OUTPUT
FILTER
.ay DO
. ') 456
.) 96
.3068
.4199
.5286
.6319
.7241
.8047
.8728
.9271
.9668
.9912
1. JUG'j
.9931
.9704
.9325
.8799
.8133
.7340
.OUOO
.OOUO
• UOOO
.0000
.3000
.0000
.;)ooo
• uQOO
.0000
.OJOO
-.5297
-.6318
-.7240
-.8047
-.3728
-.9271
-.9668
-.9912
-i.OOO'J
-.9931
-.9704
-.9325
-.8799
-.8133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
OF
TABLE:E-20 INPUT AND OUTPUT FOR COMPUTER A
TIME
. 0 1.' °
. i>2-'»
= .040
.Oy.l
.10 J
.12' !
. 1 4-.*
. 1 1? v*
.180
* '<L >•'
,?20
. ? 4 '>
,?60
. 2>- 'i
. 3 V ~J
• 3 2 ">
. 3 4 0
. ? 6 '.'>
. 3 H ">
.400
. 4 2 -j
.440
. 4 f, 0
. 4£f)
. ''•>.'•'»
£:.>•»
. i— »•
. 5 <• ••';
.560
.580
. tj !.• j •
.620
• 6 4 { j
. 6tO
. 6 8 (.'
.
 7UO
.7? ,
.
 7
^U
. 76?>
• .780
.'TOO
. H ? ^
. "40
. 8 6 0
.^ 60
. 90'j
. 92 0
,Q4?
. Q h 0
.
 Q
 8 r>
1 .000
SENSOR B
.oono'
.0000
. OOQO
.0000
.5H7JH
. t- "4^
.77^5
. P44:i
.
 Q048
.9511.
. 98? 3
. 99HO
. 9980
.982?'
. ° * 1 J,
.904H
.8443
.7705
. 6 "4 S
'.5878
• *• 91 8
.36H1
.2487
.1^53
-. vi.i.K'1
- . 1 ? 5 3
-. 24* 7
-. 36* 1
-. 4F18
c o 7 o
^ . ' O * o
-. 6«4^
-.7705
-.8443
- . 9 0<» ?
- . ^  5 1 1
- .' 9 H ? 3
-.9980
-. 998J
-.9823
-.9511
- , 9 0 4 H
-.8443
-.7705
-.6845
-.5878
-.4818
-. 36»1
-.2487
- . 1. ? "5 3
.0000
SELECTED
SIGNAL
.00,00
.00 JO
.0000
.O-'IOO
.*5H78
.6845
.7705
.8443
.904M
. 9 5 J. 1
.9823
.998 <')
. 9^80
.9823
.9511
, 0 0 4 R
. 8 4 4 3
.7705
.6845
.5878
.4818
.3681
.2487
. 1 2^3
-. 0000
- . 1 2 5 ^
-.2467
-.3081
-.4818
-.5676
-.6845
-. 77ij5
-.8443
-.904b
-.9511
-.9823
-.9980
-.9981"'
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
.0000
INPUT TO
FILTER
.0000
. yOJO
.0000
.0000
. 5878
.6845
.7705
.8443
.9048
.9511.
.9823
.9980
.9980
.9W23
.9611
,904«
.6443
.7705
.6845
,5878
.4dl8
.3681
.2487
.1253
-.0000
-.1253
-.2487
-.3681
-.4818
-.5678
-.6845
-.7705
-.8443
-.9048
- . 9 5 i i
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
.0000
OUTPUT
FILTER
.UOCO
.0000
.uOO-3
.0000
.5286
.6319
.7241
.3J47
.8728
.9271
.9668
.9912
1 . jou J
.9931
.9704
.9325
.8799
.8133
.7340
.6431
.5423
.4324
.3160
,19<t6
.0701
- . ) 5 5 5
-.1802
-. 3 J21
-.4192
-.5297
-.6318
-.7240
-.8047
-.8728
-.9271
-.9668
-.9912
-1.0000
-.9931
-.9704
-.9325
-.8799
-.3133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
OF
TABLE:E-21 INPUT AND OUTPUT FOR COMPUTER B
TIME
.000
.02d
.040
.'HO
. '"> P 0
,U-0
. 1 ? 0
.]>.•>
. 1 t 0
.180
.200
.?20
.2*0
.260
.260
. 3 0 0
.320
.3^0
.3*0
.386
.400
.420
.44.)
.460
,48n
.300
.520
.540
.560
.580
.6 JO
.620
.640
.660
.680
.7CO
.720
.7<tO
.760
.?eo
. eoo
.820
.340
.86')
.8*0
.900
.020
.940
.960
.960
1 . 000
SENSOR C
.onoo
.0000
,(...»DO
.0000
.0000
.'JUDO
.0000
.0000
.0000
.0.190
.9511
.Q8.23.
,99flft
. Q q R 0
.9 8? 3
.9^11
.904 8
.844?
.7705
.£845
.5878
.4818
.1W1
.2487
,1?53
-.OOOO
-.1253
-.2487
-.3*81
-,<«918
-.^87R
-.6845
-.7705
-,t?443
. -.9048
-.9511
-.9823
-,Qq>>0
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
' .0000
SELECTED
SIGNAL
.0000
.0000
.0000
• OOOu
.0000
.yOOO
.0000
• 00<V»
.0000
.0000
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.«443
.770:-
.6345
.5878
.4818
. .3681
.2487
- .1253
-.0000
-.1253
-.2487
-.3681
-.^818
-.5873
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.QU48
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2^87
-.1253
.0000
INPUT TO
-"FILTER
.0000
• OUOO
.0000
.0030
.0000
.0000
.0000
.0000
.0000
. oooo
.9511
.9.823
.9980
.99 SO
.9823
.9511
.9048
.8443
.7705
.6645
.5878
.4818
.3681
.2487
.1253
-.0000
-.1253
-.2487
-.3631
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9Q80
-.9823
-.9511
-.9U43
-.8443
-.7705 •
-.6845
-.5878
• -.4618
-.3681
-.2487
-.1253
.0000
OUTPUT OF
FILTER
.0000
.0000
.0000
.0000
.0000
.0000
.ouoo
.0000
.0000
.0000
.9271
.9668
.9912
1.0000
.9931
.9704
.9325
.3799
.8133
.7340
.5431
.5420
.4324
.3160
.1946
.0701
-.0555
-.1802
-.3021
-.4192
-.5297
-.6318
-.7240
-.8047
-.8728
-.9271
-.9668
-.9912
-1.0000
-.9931
-.9704
-.9325
-.^1799
-.8133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
TABLE:E-22 INPUT AND OUTPUT FOR COMPUTER C
Double Power Fault
After attaining triplex operation, the A computer experiences a power loss from 0.4
to 0.48 seconds. The C computer then has a power loss from 0.6 to 0.68 seconds. This
can be seen by examining Table E-23.
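
The power history for this run can be written down as a small schedule. The sketch below is illustrative only: the names are hypothetical, and the intervals are expressed as iteration indices (0.02 second per iteration) read from the POWER columns of Table E-23, including the staggered start-up of the three computers.

```python
DT = 0.02  # seconds per iteration, from the TIME column of Table E-23

# Inclusive iteration ranges during which each computer has no power.
POWER_OFF = {
    "A": [(0, 0), (20, 24)],   # start-up, then 0.40 s through 0.48 s
    "B": [(0, 4)],             # start-up only
    "C": [(0, 9), (30, 34)],   # start-up, then 0.60 s through 0.68 s
}


def powered(name, k):
    """True if the named computer has power at iteration k."""
    return not any(lo <= k <= hi for lo, hi in POWER_OFF[name])


def powered_set(k):
    """Computers with power at iteration k, in A, B, C order."""
    return [name for name in "ABC" if powered(name, k)]
```

With this schedule, `powered_set(22)` gives B and C (A's outage) and `powered_set(32)` gives A and B (C's outage), so the two faults never overlap and at least two computers are powered from 0.10 onward.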
The synchronization status for the system is shown in Table E-24. Tables E-25, E-26,
and E-27 show the recovery requests as seen by each computer. At 0.02 computer A
comes on, notes it is alone, and proceeds in simplex. At 0.1 A sees B come up, releases
B's permanent flags, and sets B's do-not-use flags. At 0.2 A sees C come up but, since
B has not yet completed recovery, holds C failed until 0.22, when B finally recovers.
From 0.4 to 0.48 A's power is off. At 0.5 A's power comes back on and A recovers to
C. At 0.7 A releases C for recovery again.
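
The sequencing described above, in which C is held failed until B has finished recovering, amounts to releasing one computer for recovery at a time. The sketch below illustrates that ordering with a simple queue. It is an assumption-laden illustration: the report implements this with permanent and do-not-use flags, and the class and method names here are hypothetical.

```python
from collections import deque


class RecoveryGate:
    """Releases at most one newly powered computer for recovery at a time."""

    def __init__(self):
        self.recovering = None       # computer currently in recovery, if any
        self.held_failed = deque()   # powered-up computers waiting their turn

    def power_up(self, name):
        """Return True if `name` is released for recovery immediately."""
        if self.recovering is None:
            self.recovering = name
            return True
        self.held_failed.append(name)
        return False

    def recovery_complete(self):
        """Finish the current recovery; return the next computer released."""
        self.recovering = None
        if self.held_failed:
            self.recovering = self.held_failed.popleft()
        return self.recovering
```

In the run described here, `power_up("B")` at 0.1 releases B at once, `power_up("C")` at 0.2 returns False because B is still recovering, and the following `recovery_complete()` releases C.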
At 0.1 B comes on, notes that A is operating, and recovers to A. After recovering,
B remains good for the remainder of the run.
At 0.2 C comes on and recovers to B. Since B's do-not-use flags are still set, C does
not begin recovery until B has completed recovery.
Tables E-28, E-29, and E-30 show the do-not-use flags, permanent flags, T1 counter,
and T2 counter for A, B, and C, respectively.
The inputs and outputs for A, B, and C are shown in Tables E-31, E-32, and E-33.
POWER
TIME
.000:1
.-:»2rtfl
.0400
.0600
.OH oo
. 1000
.1200
.1400
.160-.1
.IROO
.2000
.£?vO
.24CO
,?oOO
.3800
.30-JO
.3200
• 34vv>
.3600
.3HUO
.4000
.4200
.4400
.46CK)
.480-.I
.500(j
.5200
.5400
. 5600
.5800
.6000
.6200
.6400
.6600
• 69^ 0
.7000
.7200
.7400
.7600
• 78UO
.8000
.s?co
.8400
.8600
,880ft
,QOOO
.9200
.9400
.9600
,9«00
1.0000
A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
(•
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
POWER
B
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
"T
T
T
T
T
T
T
WATCH-DOG
C i
F
F
F
F
F
F
F
F
F
F
T
. T
T
T
T
T
T
T.
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F .
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
B
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
C
F
F
F
F
F
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
INTERRUPT
A
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F .
F
F '
F
F
F
F
F
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
B
F
F
F
F
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
C
F
F
F
F
F
F
F
F
F
F
T
F
F
F
F
F
F :
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
T
F
F
F
F
F
F
F
F
F
F
F
F
F
F
F
ITERATION
RESET
A
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T .
T
T
T
T
T
T
T
T
T
T
T
t
T
T
T
B
F
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
t
T
T
T
T
T
T
T
T
T
T
r
T"
T
T
T
C
F
F
F
F
F
F
F
F
F
F
. T
T
r
T
T
T
' T
T
T
T
T
T
T
T
T
T
T
T
T
T
T
F
F
F
F
T
T
T
T
T
T
T
T
T
T
T
T
.. ._„..
T
T
T
TABLE:E-23 SIMULATED HARDWARE INDICATORS
TIME
.000
.O??!
.040
.06''
.060
. J.uo
1 .120
I . 1 <* 0
i .16'.)
i . i°0
i .200
; . 2?-'>
. ?"*<.»
.260
. ?.$**
.30J
.3?0
; o /, A
• .? •* \>
.
3>;-
.3^0
, 4 < > < «
,4?0
.^J
. A f - O
.4P<Y
. 5 ^ ' v >
.?2J
.^40
.
C
' 6C>
. 5 t M
.600
.^c>;
.040
.660
.680
.?(.:.
.720
• 74.J
.760
.7PO
.800
^H20
, 8 ^ / s
, f t60
.880
,-9f-0
.920
.940
.960
.980
1.000
A
0
I.
^t
4
4-
b
h
ft
6
6
7
7
7
7
7
7
7 .
7f
7
7
V
0
' 0
0
o
7
7
7
7
7
6
6
b
6
6
7
7
7
7
7
7
. 7
7
7
7
7
7
7
7
7
7
B
0
/•>•
0
0
' !)
6
b
<S
h
6
7
7
7
7
7
7
7
7
— -f —
7
7
3
3
3
3
3
7
7
7
7
7
t>
ft •
6
6
6
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
C
0
0
0
0
0
0
0
.•)
0
0
7
7
7
7
7
7
. 7
7f —
7
7
3
?
3
3
•5
7 '
7
7
7
7
0
o
0
0
0
7
7.
7
7
7
7
7
7
7
7
7
7
7
7
7
7
TABLE:E-24 SYNCHRONIZATION INDICATORS
TIME
. Or,;.!
% • • / ? ? !
.0^0
. 06'0
. 0 f1 0
.io;.1)
* 1 ? 0
. 1 4 ll
.1M1
. 1 p u
.200
, ??0
.?<*•!
. ?e>"
• 2 r •".«
.300
. ? 2 J
. 3<-0
. 3 £ C
, 3«0
, 4 < ^ >
.*«2u
. 440
. 4 f- „>
, 4 P O
• * f • •'*
.520
» 54 o
.?60
. *> ? 0
.600
. b 2 U
,64u
. 6^0
. f H n
,7,io
.
 7
 2 0
.740
.760
.7P.fi
. "Oil
* 820
.8*,»
.fit.''-
.860
.900
.920
,94.j
.^bO
•
 Q
 1 0
.000
-1
V
0
0
i..
<«
«'•
n
0
o
• 3
0
•j
0
n
I-
0
'!
\)
0
-1
-1
-1
-1
-1
1
0
0
0
n
n
0
(i
0
c-
3
o
o
o
•T
0
0
0
o
0
0
o
0
0
ft
0
f^
0
1
1
fl
0
0
•1
u
0
J
rv
o
i -'
0
0
0
0
. :t
o
0
0
0
0
ij
0
T
0
0
^>
o
0
n
0
0
n
0
0
0
0
(.1
0
0
0
o
0
0
n
0
0
0
0
B
0
t
o
o
0
0
• >
-•)
0
0
0
0
o
0
c>
0
0
0
0
0
0
0
0
;>
0
>"*
0
0
0
f:
•J
r>
0
0
0
G
0
0
0
0
0
o
0
0
0
0
0
0
0
0
0
c
0
0
0
c
0
f >
0
0
0
0
1
J
o
0
0
0
u
c
u
o
0
0
0
f,
0
\>
0
o
0
o
0
0
0
0
o
0
0
0
0
0
0
o
0
o
0
0
0
0
0
0
0
TABLE:E-25 RECOVERY INDICATORS FOR COMPUTER A
TIME
.'••rui
. 02o
.0*0
,0^0
• *'£ l'>
. 1 0 C>
. i 2 "
. l * n
..H-O
. '1. '? : »
.20'.'
,2^ r>
.2*0
. 2 n 'j
. 2 V 0
. 3</0
.320
. 3 * >">
. 3 6 0
.3*0
. *0y
. *2u
.**•.>
,*60
. * fc ..»
. 5 0 J
.520
. 5 * n
. ?6'J
.5 HO .
.iSuH
• '; 2 0
. t * U
. fcbO
• 6F'0
. 7^0
.720
. 7**..'
. 760
. 7 £ n
.ROO
'• fl?0
.8*0
. H 6 C>
• 8 t fO
.900
.920
.9*0
.960
.9^0
.000
-1
-1
-1
-'I '
• -1
1
|:
(.:-
o
0
4
;"•
o
o
';
,i
0
n
o
0
,'i
0
J
0
o
3
;j
O
^
0
0
rt
C)
0
n
4
0
'!
0
0
0
0
0
•'j
0
rs
u
0
0
0
o
0
A
o
0
' v.'
o
0
••)
.•)
'.)
o
V*
0
n
0
o
0
0
G
o
o
0
0
0
0
0
)
0
0
0
0
u
0
0
o
0
0
0
0
•J
0
n
o •
0
0
0
0
o
0
0
<"!
0
o
B
0
0
i)
0
C',
0
0
t)
0
0
0
. o
0
\
u
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
f.»
0
0
o
0
0
0
0
0
u
0
u
0
0
0
0
u
0
0
c
0
0
0
0
0
o
0
0
0
o
1
1
0
o
0
0
0
0
0
0
n
0
o
0
0
n
0
0
0
r,
0
0
c
0
0
,1
0
0
0
0
0
0
o
0
0
0
0
0
0
0
0
TABLE:E-26 RECOVERY INDICATORS FOR COMPUTER B
TIME
.000
.020
.040
.060
.080
; .10-)
.120i
.14,)
.160
.180
.200
• ??0
.2*0
.2*0
.280
.300
• 3 2 •">
.340
' .36 U
. 3 f < 0
. ^ 00
.
&2u
.440
.4*0
.480
, 5 (^ ') _
I .520
i . 54i'f
.560
.5*0
.650
. 6 2 o
.'•40
.660
. 6 P O
. 7u-.j
. 7^0
.740
. ?M,»
. 7 P O
. Hi." 0
. 8 ? n
.£<»! .<
. P 6 •>
.8*0
. 90''«
Q P ,"\
O £l ' I
.^to
, Q f c M
l . G C O
-i
-1
-1
-1
-1
-1
-1
-1
-1
-1
1
0
(•
0
•J
0
0
.')
o
n
0
f\
0
;')
D
4
0
c
0
l i
-1
- 1
^ -I
-1
-i
i
o
0
n
0
•v?
o
•J
'j
0
f;
0
i l
0
o
0
A
o
o
0
o
0
0()
u
0
0
0
0
n
0
0
0
0
i!
r,
0
,-")
'1
0
J
o
• i
0
0
(i
.:.;
r>
O
v/
0
•j
0
5
o
u
•'•-
C
o
J
o
U
1.1
0
•>
0
0
0
B
0
(.'
0
0
0
o
'\
0
0
0
o
0
i/
0
0
0
(i
f.
"\\ l
0
;.t
0
0
n
0
^
<)
U
o
"1
'.)
0
< t
v/
0
;;
n
o
0
..'
•J
:)
0
0
•J
0
0
0
,•)
0
:'l
0
C
0
0
0
0
0
n
0
o
0
0
1
1
o
0
0
0
ft
o
i.l
C
o
r>
0
ri
0
o
0
0
0
t 1
0
0
n
0
r-.
0
o
0
0
0
ij
n
0
0
0
' 0
0
r>
0
(i
0
TABLE:E-27 RECOVERY INDICATORS FOR COMPUTER C
7
o <cT<^ <^
T I M E
.000
.020
.0*0
.060
.080
.100
. I 2 0
.1*0
.160
.180
.200
.220
.2*0
. ?. b 0
.280
.300
.320
.3*0
.360
.360
. *. 0 Q '
.*?#
.**0
. * 8 0
. 5 0 ^
.520
.5*0
.560
.580
, fryQ
.620
.6*0
.660
.660
.700
.720
.7^0
.760
.780
.800
.820
.6*0
. BhO
.880
.900
• 9?0
,Q*0~
,960
.980
1.000"
A
0
0
0
0
0
0
0
0
0
0)
•)
0
•;)
0
0
0
0
0
o
h-
1
i 1
1
1
1
1
1
0
M
0
0
0
o
0
o
0
0
0
0
0
0
0
0
0
0
0
0
B
.">
0
0
0
0
11
1
1
I
i
0
0
0
0
Vf
0
M
0
o
-
0
A
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
y
0
0
0
0
0
0
0
c
o
(.1
0
0
.)
0
0
0
0
0
V
0
1
1
L
1
1
1
0
0
—
0
0
Q
0
0
0
0
0
0
0
.1
i
1
1
1
1
"0
0
0
0
0
0
o
0
0
0
A B
0 f)
0 1
0 I
0 1
0 1
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
o y
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
6 " o
0 0
0 0
0 0
0 0
0 0
0 0'
0 0
0 0
0 0
0 0
0 0
0"0"
0 0
0 0
"0 0
c
0
1.
1
3.
1 .
1
1
1
1
1
1
1
0
l)
0
0
0
0
0
0
0
0
0
0
01
1
3.
1
1
0
0
0
Q
0
0(j
0
0
0
o
0
"0 "~
0
0
0
T1 COUNT
A -
0
0
0
0
0
0
-0
0
0
0
0
o
0
0
0.
0
0
0
0
0
1
2
3
*
5
6
0
0
0
0
.0
0
o
0
0
0
0
0
0
0
0
0
6
0
0
0
.B
Q
0
0
0
0
1
2
3
<f
3
6
0
0
0
0
0
0
a
0
0
0
,•)
0
o
0
)
0
0
0
0
0
0
:)
0
• 0
0
0
0
0
0
0
0
o
0
0
0 "
c
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
*
5
6
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
*
5
6
0
0
0
0
0
0
' 0
0
c
0
T2 COUNT
A
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
l"
I
3
*
5
6
0
0
0
0
'0
0
Q
0
0
0
0
0
0
0
0
0
6
0
0
0
B
0
0
0
u
0
1
2
3
^5
6
o
0
3
0
0
0
y
• 0
0
""o
0
0
0
0
0
.)
0
0
0
0
0
0
0
0
0
0
0
0
0
•J
0.
0
0
0
0
c
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
*
5
6
0
0
"
„,
"o"
0
0
o
0
0
0
0
0
0
1
2
3
*
5
6
0
0
0
0
0
0
0
0
0
0
TABLE E-28: DO NOT USE/PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER A
A is off
T1 COUNT    T2 COUNT
T I M E
,000
.020
.040
.060
.080
.100
.120
.140
.160
.180
.200
.2?0
.240
.260
. Z f f j
.300
,3?0
. ?<«0
.360
. 3BO
.400
.420
.440
.460
.480
. i > r t i i
.520
.540
.'560
• 5 f ^>
. f - O f J
,620
.640
.660
.660
.700
.720
.740
.760
.780
.600
.820
.840
.860
.860
.900
.920
.940
. 9 6 •'/
.960
1.000
A
-.>
0
0
0
.*»
o
0
0
o
0
0
0
0
f)
f}
u
0
o
0
0
0
:,1
o
0
i.)
I
1
1
1
1
1
•J
0
0
0
0
)
0
o
0
A
0
0
0
0
0
0
0
0
0
0
B
, >
0
0
0
0
1
1
1
1
1
1
0
0
0
0
0
0
0
o
u
0
0
0
0
0
u
0
0
0
0
o
0
0
0
u
0
.)
0
0
0
0
0
0
0
0
0
0
0
0
0
0
c
0
0
ft
0
u
0
'•J
0
0
0
0
0
1
1
1
1
1
1
0
0
0
0
o
0
0
0
0
«J
c
0
:.j
0
0
0
0
I
1
1
i
1
i
0
0
0
0
0
0
0
0
0
0
A
^0
0
0
0
0
c
0
0
0
[I
0
0
u
u
0
0
0
0
0
1
1
1
1
1
0
0
0
o
0
0
'!
0
0
f\
0
0
n
0
0
0
ft
0
0
0
0
0
0
0
0
0
B
0
0
0
0
0
0
0
0
0
0
0
o
0
u
f)
0
0
0
0
)
0
0
0
o
o
0
0
0
0
0
:j
o
0
0
0
0
•J
0
u
0
0
0
a
0
0
0
0
0
f)
0
0
c
0
0
0
0
0
1
1
1
i
1
1
A
0
0
0
0
0
0
0
u
0
0
0
U .
0
V
0
0
0
a
L
•1
,t
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
A
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0'
0
0
0
0
0
0
1
A
2
3
4
. 5
t>
c
0
0
0
0
0
0
0
0
0
0
0
0
c
0
0
0
0
0
0
B
0
0
0
0j
I
2
3
4
5
6
)
0
o
1)
0
0
'0
0
3
0
0
0
0
0
0
0
;)
g
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
u
0
0
0
0
0
0
c
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
4
c
6
0
0
0
0
0
0
0
0
0
0
A
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
y
0
0
0
0
01
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
6
0
0
0
0
0
0
0
0
0
B
0
o
0
0
0
1
2
3
4
5
6
0
0
)
0
0
0
0
0
,3
0
o
j
0
J
,1
o
0
0
. 0
,)
0
0
0
a
0
0
0
0
Q
0
0
3
0
0
0
0
'J
0
0
0
c
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
0
0
c
0
0
0
0
1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0
TABLE E-29: DO NOT USE/PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER B
T I M E
.000
.020
.0*0
.060
.060
.1.00
.120
.1*0
.160
.180
.200
.220
. 2<« )
.260
.280
.300
.320
.3*0
. ? 6 0
. ?8U
,*00
.420
. **0
.*60
.*eo
.500
.520
.54y
.560
. •> a o
.620
.6*0
.660
.680
.700
.720
.7*0
.760
.780
.ROD
.82 U
. ^*0
.860
.880
.900
.920
" .9*0
.960
.980
1.000
A
n
D
0
0
0
0
o
6
3
0
0
;.»
0
0
0
0
0
0
T
0
0
0
0
0
0
1
1
1
1
i
f-
j
LO
0
0
0
0
0
0
;,)
o
0
0
0
0
•)
0
0
0°
B.
0
0
0
0
0
0
f)
0
0
0
1
0
0
0
0
0
0
0
{';
0
0
0
0
0
0
0
0
0
0
i>
0
0
0
0
0
0
0
0
0
0
0
0
"0
0
0
0
<?.
c
U
0
0
0
0
0
0
U
0
0
0
'•)
I
1
1
11
1
0
0
0
n
0
0
0
0
0
0
0
U
1
1
I
i1
1
0
0
0
0
0
0
0
0
0
0
A
0
0
0
0-
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
011
1
1
1
0
0
0
0
6
0
0
0
0
0
0
0
0
0
0
0
0
6
0
0
n
B
0
0
0
0
0
0
0
0
0
0
0
0
v>
0
0
0
0)
0
0
0
i)
0
0
0
0
0
Q
0
i)
0
0
0
d
0
0
0
0
0
0
0
0
0
0
0
0"
cf)
0
0
0
0
0
3
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
T1 COUNT
A
0
0
0
0
0
C)
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Q
0
1
2
3
*
i?
6
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
B
0
0
0
0
0
0
0
0
0
0
6
0
0
0
0
I)
0
0
0
0
0
0
0
0
0
0
0
0
0
y
0
0
0
U
0
U
"0
0
0
0
0
0
' 0
0
0
0
c
0
0
0
0
0
0
0
0
0
.0
0
01
2
3
*
5
6
0
0
0
0
0
0
0
0
0
0 .
0
0
1
2
3
*
5
6
0
0
0
0
0
0
0"
0
0
0
T2 COUNT
A
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
01
2
3
4
5
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
B
o
0
0
0
0
J
0
0
•3
0
6
0
0
0
0
0
0
.)
0
• 0
0
0
0
3
0
0
3
0
0
0
0j
0
0
0
a
0
')
J
0
•)
0
0
0
a
0
c.
0
0
0
0
0
0
0
0
0
0
0
U
1
2
3
4
s
b
0
0
0
0
0
D
0
0
0
0
0
0
^
I1
2
3
4
5
6
0
0
0
0
0
0
0
0
0
0 "
TABLE E-30: DO NOT USE/PERMANENT FLAGS AND COUNTERS
FOR A, B, AND C SIGNALS FOR COMPUTER C
C is off
• TIME
.OI'-D
,G2v'
.040
.060
.OPO
.100
.120
.140
.160
.ieo
.200
.220
.2^0
.260
.280
. 300
.320
. 34U
.360
.380
.400
.420
.460
.480
.
r/00
.520
.540
. 5 6 rt
.580
.600
.620
.640
.660
.690
.700
.720
.740
.760
.780
.800
.820
.840
.860
.880
.900
.920
.940 "
.960
.Q80
1.000
SENSOR A
.0000
.1253
.2487 •
.3681
-,4eie
.5878
.6845
.7705
.84^3
.9048
.9511
.9 8? 3
.99*0
.9980
.9823
.9511
.9048
.844^
. 7 70 5
.684S
~-.oooo
^.1253
-.2487
-.3681
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.99SO
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.68V5
- . 5 87 8
-.4818
-.3681
-.2487
-.1253
.000^ >
SELECTED
SIGNAL
.OC'OO
.1253
.2487
.3661
.4816
.5878
.6845
.7705
.8443
.9048
.9511
.9823
.9980
.9980
.9823
.9511
.904.8
.8443
.7705
.6845
_
-.0000
-.1253
-.2487
-.3681
-.481«
-.5878
-.6845
-.7705
-.8443
-.9U48
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
" -.3681
-.2487
-.1253
.0000
INPUT
TO FILTER
.0000
.1253
.2487
.3681
.4818
.5878
.6845
.7705
.8443
.9048
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443
.7705
.6845
" " " ,
—
-.0000
-.1253
-.2487
-.3681
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
.0000
OUTPUT OF
FILTERS
.0000
.0456
.1696
.3068
.4199
.5266
.6319
.7241
.8047
.8728
.9271
.9666
. 9912
1.0000
.9931
.9704
.9325
.8799
.6133
.7340
— ~ — — ^
.0701
-.0555
-.1802
-.3021
-.4192
-.5297
-.6318
-.7240
-.8047
-.8728
-.9271
-.9668
-.9912
-1.0000
-.9931
-.9704
-.9325
-.8799
-.8133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
A is off
TABLE E-31: INPUT AND OUTPUT FOR COMPUTER A
TIME
.000
.oz^
.040
.060
.080
.103
.120
.140
.160
.1^ 0
.200
,2?.0
.2*0
.260
.280
.300
.320
.340
.360
.380
.400
.420
.440
.460
.480
.500
.520
.540
.560
. 5 P 0
.600
.620
.640
.660
.680
.7CU
.720
.740
.760
.780
.800
.820 "
.840
.860
.880
.900
.920
".940"
.960
.980
1.000
SENSOR B
.0000
.0000
'.0000
.ouoo
.0000
.5878
.6845
.7705
.84^3 -
.9049
.-9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443"
.7705
.6845
.5878
.4818 .
.3681
.2487 -
.1253
-.0000
-.1253
-.2487
-.3681
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9983
-.9980 ""
-.9823
-.9511
-.9048 "
-.8443
-.7705
-.6845
-.5878
-.4818
" -73681"
-.2487
-.1.253
.0000
SELECTED
SIGNAL
.0000
. 0000
.0000
. 0000
.0000 -
.5878
.6845
.7705
.8^43
.9048
.9511
.9823
.998U
.9980
..9823
.9511
.9048
.8443
.7705
.6845
.5878
.4818
.3681
.2487
.1253
-.0000
-.1253
-.2487
-.?681
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
.0000
INPUT TO
FILTER
.0000
.0000
.0000
.0000
.0000
.5878
.6845
.7705
.8443
.9048
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443
.7705
.6845
.5678
.4818
.3681 .
.2487
.1253
-.0000
-.1253
-.2487
-.3681
-.4818
-.5878
-.6845
-.7705
-.8443
-.9048
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
."0000
OUTPUT OF
FILTER
.0000
.0000
.0000
.0000
.0000
.5286
.6319
.7241
.8047
.8728
.9271
.9668
.9912
1.^ 000
.9931
.9704
.9325
, .8799
.8133
.7340
.6431
.5420
.4324
.3160
.1946
.0701
-.0555
-.1802
-.3021
-.4192
-.5297
-.6318
-.7240
-.8047
-.8728
-.9271
-.9668
-.9912
-1.0000
-.9931
-.9704
-.9325
-.8799
-.8133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
TABLE E-32: INPUT AND OUTPUT FOR COMPUTER B
TIME
.000
.Q2s^
.040
.060
.080
~.10tf
.1.2.)
.140
.160
.180
. 200
.220
.240
.2ft«
.280
.300
.320
.3^0
.360
.380
.400
.420
.440
.''60
.480
.500
.520
.543
.560.
.580
.600
.620 j
.640" j
.660 I
.680 ;
. 7GO
.720
.740
.760
.780
.8 CO
.820"
.840
.860 .
.680
.900
.920
".940
.960
.980
1.000"
SENSOR C
. 0000
.OflOJ
."0000
.0000
.0000
.0000
.0000
.0000
.0000
.0000
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443
.770?
.6845
.5878
.4818
.3681
.2487
.1253
-.0000
-.1253
-.24*7
-.3681
-.4818
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
"-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.'3681"
-.2487
-.1253
.0000
SELECTED
SIGNAL
.0000
.0000
.0000
.0000
.0000
.0000
.0000
. 0000
.0000
.0000
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443
.7705
.6845
.5878 -
.4818
.3681
..2487
.1253
-.0000
-.1253
-.2487
-.3681
-.4818
-.9511
-.9823
-.9980
.-.9980
-.9823
-.9511
-.9048
-.6443
-.7705
-.6845
-.5878 '
-.4818
-.3681
-.2487
-.1253
.0000 "
INPUT TO
FILTER
.0000
.0000
.0000
.0000
.0000
.0000
.0000
.0000
.0000
.0000
.9511
.9823
.9980
.9980
.9823
.9511
.9048
.8443
.7705
.6845
.5878
.4818
.3681
.2487
.1253
-.0000
-.1253
-.2487
-.3681
-.4818
-.9511
-.9823
-.9980
-.9980
-.9823
-.9511
-.9048
-.8443
-.7705
-.6845
-.5878
-.4818
-.3681
-.2487
-.1253
.0000
OUTPUT OF
FILTER
.0000
.0000
.0000
.0000
.0000
.0000
,0000
.0000
.0000
.0000
.9271
.9668
.9912
1.0000
.9931
.9704
.9325
C8799
.8133
.7340
.6431
.5420
.4324
.3160
.1946
.0701
-.0555
-.1802
-.3021
-.4192
-.9271
-.9668
-.9912
-1.0000
-.9931
-.9704
-.9323
-.8799
-.8133
-.7340
-.6431
-.5420
-.4324
-.3160
-.1946
-.0701
C is off
TABLE E-33: INPUT AND OUTPUT FOR COMPUTER C
APPENDIX F
CARSRA, A RELIABILITY ESTIMATION TOOL FOR REDUNDANT SYSTEMS
1.0 PROGRAM DESCRIPTION
1.1 Introduction
CARSRA (Computer Aided Redundant System Reliability Analysis) is a FORTRAN program
designed to facilitate the reliability assessment of fault tolerant reconfigurable
systems. It can take into account the influence of transient faults
and will model a wide range of redundancy management strategies.
Many previously developed reliability estimation tools are based on success path tabu-
lation. Examples are the TASRA program developed by Battelle Columbus Laboratories
(reference 1) and the ARMM program described in reference 2. One disadvantage of
the success path tabulation approach is the very large number of success paths in a
highly redundant system. Another, perhaps more significant, disadvantage of the above-
mentioned programs is their inability to model the effects of transient faults and of
failure coverage, where failure coverage is the probability of detecting, isolating,
and recovering from a failure.
Recently (see, for example, references 3 and 4), Markov modeling has been utilized to
model the effect of failure coverage. The Markov model has the advantage of offering
high flexibility, which makes it possible to take into account most factors of interest,
including fault transients, failure coverage, and the possibility of spares or maintenance
actions. However, a basic disadvantage of the direct Markov model approach is
the very large number of Markov states required to model a system with many
internal signal consolidation (voting) nodes.
The most significant feature of the CARSRA program is the concept of partitioning the
system into smaller entities, each of which may be treated by a Markov model of
lower dimension. This preserves the Markov model flexibility, avoiding the problem
of an exorbitant number of Markov states.
Another significant feature of CARSRA is the ability to assess Functional Readiness
as well as system failure probability. The concept of Functional Readiness is of con-
siderable significance for a mission containing a critical subtask which will either
be performed, or not performed, depending on the operational redundancy level at
the time of demand. An example is an aircraft automatic landing function for which a
certain level of hardware redundancy is required before a landing may be initiated
in poor visibility weather conditions.
1.2 Definition of Terms
The following special terms are used in the remainder of this appendix.
A module is the smallest functional entity treated by the program. It has a Poisson
type failure distribution with a failure rate known a priori.
A stage is a set of identical redundant modules. Voting or Signal Consolidation is often
performed on the output signals from the modules in a stage. There are, however, stages
without output voting. TMR stands for Triple Modular Redundancy and is associated with
majority voting on the module outputs of a stage such that at least two out of three mod-
ules have to operate for the stage to survive.
A channel is a particular minimal set of modules capable of performing the system
function in a non-redundant configuration.
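The definitions of module and stage can be illustrated with a small numerical sketch. It assumes a constant module failure rate (so module survival is exponential in time) and treats a voted stage as a k-out-of-n arrangement; the function names and the numerical values are illustrative only and are not taken from CARSRA.

```python
import math

def module_reliability(failure_rate_per_hour, hours):
    """Survival probability of one module with a constant (Poisson type) failure rate."""
    return math.exp(-failure_rate_per_hour * hours)

def stage_survival(n_modules, k_required, r):
    """Probability that at least k_required of n_modules identical modules operate."""
    return sum(
        math.comb(n_modules, j) * r**j * (1.0 - r) ** (n_modules - j)
        for j in range(k_required, n_modules + 1)
    )

# Hypothetical numbers: 100 failures per million hours, 10 hour exposure.
r = module_reliability(100e-6, 10.0)
tmr = stage_survival(3, 2, r)  # TMR: at least two out of three modules must operate
```

For a TMR stage this reduces to R^3 + 3R^2 Q with Q = 1 - R, the form used in the worked example of section 1.3.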
1.3 The CARSRA Approach
The system is conceptually partitioned into stages. For example, each sensor type in
a flight controls system will constitute a stage, as will the processors and each servo function.
The operational status of each stage is modeled by a finite order Markov process in
which each state corresponds to a particular redundancy state. An example is shown
in Figure F1.
The transition rates λij in the figure are assumed constant, i.e., not time varying,
and CARSRA will handle up to ten states per stage.
The operational status of a module in a particular stage may or may not depend on
modules in other stages being operational. A module which, when failed, will cause
loss of function of another module in the same channel but in a different stage will be called
a "dependency" module, and the corresponding stage a "dependency stage".
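A stage model with constant transition rates can be sketched as follows. This is not the CARSRA solution method (which uses the matrices T and TINV described later); it is a generic matrix-exponential evaluation in plain Python, with a hypothetical three-state triplex stage (no failure, one failure, stage failed) and an assumed module failure rate.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(m, terms=20, squarings=10):
    """exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def state_probabilities(rates, hours, start=0):
    """rates[k][j] is the constant transition rate (per hour) from state k to state j."""
    n = len(rates)
    # Generator matrix: off-diagonal entries are the rates, each row sums to zero.
    q = [[rates[k][j] if k != j else -sum(rates[k][c] for c in range(n) if c != k)
          for j in range(n)] for k in range(n)]
    return mat_exp([[x * hours for x in row] for row in q])[start]

lam = 100e-6    # assumed module failure rate per hour
hours = 10.0
rates = [[0.0, 3 * lam, 0.0],   # no failure  -> one failure (any of three modules)
         [0.0, 0.0, 2 * lam],   # one failure -> stage failed (either remaining module)
         [0.0, 0.0, 0.0]]       # stage failed is absorbing
p = state_probabilities(rates, hours)
```

The list `p` holds the probability of each redundancy state at the given time, starting from the fully operational state.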
[FIGURE F1: example Markov model of a stage. The legible state labels read ONE FAILURE, TWO FAILURES, and THREE FAILURES; the remainder of the diagram is not recoverable.]
Thus, there are two types of stages: dependency and non-dependency stages. Some
stages may be both dependency and non-dependency stages. The dependency structure of
a system may be described by a dependency tree diagram, an example of which is dis-
played in Figure F2 for a flight controls system. In this system the processor/memory
stage is a dependency stage, the MPX and A/D stage is both a dependency and a non-
dependency stage, and each sensor stage is a non-dependency stage.
The lines connecting the different stages indicate the intra-channel dependency structure
in the sense that a failure in, for example, the processor channel A will cause loss of
function of the channel A sensor and servo modules. The digits at the upper right hand
corner of each block indicate the number of redundant modules in each stage.
The circles to the right in the figure represent functional elements needed for system
survival. Loss of a combination of these functional elements will cause system failure.
This combination may be specified by a particular entry to the program, a feature which
will be further described shortly.
It is important to note that functional elements, which are represented by the circles, in
the CARSRA model are always outputs of non-dependency stages. In a flight control sys-
tem, a possible (but not unique) set of functional elements are the various sensor data,
the processing function of the computer, and the actuation of the control surfaces.
CARSRA treats dependency between stages by an approach which may be denoted "exhaus-
tive conditioning". The essence of this approach is to make the non-dependency stages
independent via conditioning upon the failure status of the dependency stages.
This approach is most easily explained by a simple example. Consider the
triplex system outlined in Figure F3, consisting of two sensor stages A and B, a multi-
plex stage C, and a computer stage D. The sensor signals are multiplexed and cross-strapped
into the computers, where signal selection (voting) and failure detection are performed in
software.
TMR operation is assumed, which implies that two out of three signals are required at
each voting node. For simplicity, the output voter is assumed to have
zero failure rate.
FIGURE F2 FLIGHT CONTROL SYSTEM DEPENDENCY TREE
[Blocks in the tree: PROC, MEMORY; MPX & A/D; R/A; ILS; YAW RATE; LAT ACC; NORM ACC; COMPASS COUPLER; ROLL SERVO; PITCH SERVO; YAW SERVO. Circles at the right of the figure mark the functional elements.]
Note that the sensor stages and the multiplex stage are mutually dependent in the
sense that a multiplex failure will cause a module failure both in sensor stage A and
sensor stage B. The dependency is unidirectional since a sensor failure will not
prevent the multiplex module from acquiring data from a sensor in another channel.
Figure F4 displays the corresponding dependency tree.
[FIGURE F4: dependency tree for the example system. Only the labels B, D, and ELEMENT are legible; the diagram is not recoverable.]
To explain the approach, the following theorem will be needed:
Let $E_i$, $i = 1, 2, \ldots, n$, be disjoint events with $\sum_{i=1}^{n} P(E_i) = 1$.
Let $F$ be an arbitrary event. Then,
$$P(F) = \sum_{i=1}^{n} P(F \mid E_i) \, P(E_i)$$
The success probability for the system of Figure F3 may now be found by defining the
events $E_i$ as follows:
$E_1$ = no multiplex module failed
$E_2$ = one multiplex module failed
$E_3$ = two multiplex modules failed
$E_4$ = all multiplex modules failed
The probability of system success may, according to the above theorem, be expanded:
$$P(S) = \sum_{i=1}^{4} P(S \mid E_i) \, P(E_i)$$
The advantage of this representation is that the probabilities $P(S \mid E_i)$ usually are
easier to find than $P(S)$ directly. For example,
$$P(S \mid E_1) = P[(\text{Stage A survives}) \text{ and } (\text{Stage B survives}) \text{ and } (\text{Stage D survives})]$$
becomes, with $R$ = module reliability and $Q = 1 - R$:
$$P(S \mid E_1) = (R_A^3 + 3R_A^2 Q_A)(R_B^3 + 3R_B^2 Q_B)(R_D^3 + 3R_D^2 Q_D)$$
Furthermore,
$$P(E_1) = R_C^3$$
The next term in the expression contains the factor $P(S \mid E_2)$, i.e., the probability of
system survival given that one multiplex module is failed. In this case both of the
remaining modules in sensor stages A and B have to survive for system survival:
$$P(S \mid E_2) = R_A^2 R_B^2 (R_D^3 + 3R_D^2 Q_D)$$
$$P(E_2) = 3R_C^2 Q_C$$
Finally,
$$P(S \mid E_3) = P(S \mid E_4) = 0$$
Summarizing, the system reliability becomes:
$$P(S) = (R_A^3 + 3R_A^2 Q_A)(R_B^3 + 3R_B^2 Q_B)(R_D^3 + 3R_D^2 Q_D) R_C^3 + R_A^2 R_B^2 (R_D^3 + 3R_D^2 Q_D) \cdot 3R_C^2 Q_C$$
Note that this expression differs from what is obtained if the stages are assumed
independent:
$$P(S_{IND}) = (R_A^3 + 3R_A^2 Q_A)(R_B^3 + 3R_B^2 Q_B)(R_C^3 + 3R_C^2 Q_C)(R_D^3 + 3R_D^2 Q_D)$$
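The triplex example can be checked numerically. The sketch below evaluates the conditioned expression (conditioning on the number of failed multiplex modules), the expression obtained under the independence assumption, and a brute-force enumeration of all module up/down patterns. The dependency rule used in the enumeration (a sensor signal is usable only if the multiplexer of its own channel works) is an interpretation of the example, and the reliability values are arbitrary.

```python
from itertools import product

def two_of_three(r):
    """TMR stage survival: at least two of three identical modules operate."""
    return r**3 + 3 * r**2 * (1.0 - r)

def conditioned(ra, rb, rc, rd):
    """P(S) by conditioning on the number of failed multiplex (stage C) modules."""
    qc = 1.0 - rc
    none_failed = two_of_three(ra) * two_of_three(rb) * two_of_three(rd) * rc**3
    one_failed = ra**2 * rb**2 * two_of_three(rd) * 3 * rc**2 * qc
    return none_failed + one_failed  # two or three failed multiplexers contribute zero

def independent(ra, rb, rc, rd):
    """P(S) if the four stages are (incorrectly) treated as independent TMR stages."""
    return two_of_three(ra) * two_of_three(rb) * two_of_three(rc) * two_of_three(rd)

def brute_force(ra, rb, rc, rd):
    """Sum the probability of every one of the 2**12 module patterns that succeeds."""
    total = 0.0
    for bits in product((0, 1), repeat=12):
        a, b, c, d = bits[0:3], bits[3:6], bits[6:9], bits[9:12]
        prob = 1.0
        for stage, r in ((a, ra), (b, rb), (c, rc), (d, rd)):
            for up in stage:
                prob *= r if up else 1.0 - r
        # A sensor signal reaches the voters only if its channel's multiplexer works.
        a_ok = sum(a[i] and c[i] for i in range(3)) >= 2
        b_ok = sum(b[i] and c[i] for i in range(3)) >= 2
        d_ok = sum(d) >= 2
        if a_ok and b_ok and d_ok:
            total += prob
    return total

values = (0.95, 0.90, 0.98, 0.99)  # arbitrary R_A, R_B, R_C, R_D
p_cond = conditioned(*values)
p_ind = independent(*values)
p_enum = brute_force(*values)
```

Under these assumptions the conditioned value agrees with the enumeration, while the independence assumption gives a higher, optimistic value.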
As was mentioned above, the outputs from the non-dependency stages constitute the
functional elements required for system survival. There are, however, situations
where these functions themselves may be redundant, for
example the redundancy between aileron and spoiler control surfaces of certain air-
craft. CARSRA will model this situation by accepting a success event tabulation cover-
ing all functional element combinations equivalent to system success.
Summarizing, the computation is performed in three different steps: Markov model-
ing for each stage, treating dependencies between stages via exhaustive conditioning
and specifying the functions needed for success by success configuration tabulation.
1.4 The Functional Readiness Feature
Fault tolerant systems will continue to perform their functions even after experienc-
ing one or several module failures. With this basic feature it will be of interest to
be able to assess the probability of experiencing a certain redundancy degradation
within a prescribed time interval and furthermore to be able to assess the system
failure probability given that a certain degradation has taken place. This informa-
tion could be used to establish Functional Readiness Criteria, i.e., the system failure
states that will not cause deferment of a particularly critical phase of a mission, for ex-
ample the landing of a manned spacecraft on the moon or the previously mentioned case
of automatic landing of a passenger transport in low visibility weather conditions.
CARSRA accepts a selected Functional Readiness Criterion specifying the combination
of modules which could be failed and computes the probability of having any of these
modules failed as a function of time. It also computes the system failure probability given a
Functional Readiness Criterion as a function of time in a separately specified time frame.
Two different system failure modes may be specified, e.g., detected or undetected
system failure.
The following relations are used. Let $PFR_i(t_1)$, $i = 1, 2, \ldots, N$, denote the probabilities
associated with the different specified Functional Readiness configurations at a time $t_1$,
and let $PFP_i(t_2)$ be the conditional probability of a certain failure mode at an exposure
time $t_2$ given Functional Readiness configuration $i$.
CARSRA then computes:
$$PFR = \text{Functional Readiness} = \sum_{i=1}^{N} PFR_i(t_1)$$
$$PFP = \text{Failure Probability} = \left[ \sum_{i=1}^{N} PFP_i(t_2) \times PFR_i(t_1) \right] \cdot PFR^{-1}$$
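The two quantities can be sketched in a few lines. The sketch assumes that the final factor in the failure probability relation is a normalization by PFR, so that PFP is the readiness-weighted average of the conditional failure probabilities; the configuration probabilities below are invented for illustration.

```python
def functional_readiness(pfr):
    """PFR: sum of the probabilities of the accepted readiness configurations at t1."""
    return sum(pfr)

def failure_probability(pfp, pfr):
    """PFP: conditional failure probabilities at t2, weighted by readiness and normalized."""
    return sum(p * w for p, w in zip(pfp, pfr)) / functional_readiness(pfr)

# Hypothetical values for three accepted configurations
# (e.g. no failure, one sensor failed, one servo failed).
pfr_i = [0.90, 0.08, 0.015]
pfp_i = [1e-9, 1e-6, 1e-4]

pfr_total = functional_readiness(pfr_i)
pfp_total = failure_probability(pfp_i, pfr_i)
```

Because of the normalization, the result always lies between the smallest and the largest conditional failure probability.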
In addition to the above mentioned features, two levels of computational accuracy may
be specified. The resulting computational roundoff errors are indicated in the computer
printout.
1.5 CARSRA Program Structure
In this section, the structure of the program will be described in enough detail to
provide a basic understanding of the operation.
The program is designed using a top-down approach with each subtask carried out in a
separate subroutine. The overall structure is shown in Figure F5.
MAIN is the main program and each of the other blocks represents a subroutine, with a
higher order subroutine calling a lower order routine as indicated by the lines connect-
ing the different blocks. The subroutine FAILPR will, for example, call the subroutines
SETETY, STRIP, EQUAL, PRINTZ, INDFP and DEPFP.
The operation of each of the subroutines and their interplay will next be described.
[FIGURE F5: CARSRA program structure. The block diagram is not recoverable.]
1.5.1 Subroutine Descriptions and Flow Diagrams
MAIN: The MAIN program directs the data input, the computation, and the data
printout. It calls the subroutines READIN, INITYZ, COMPUTE and OUTPUT. As
part of the input data from READIN, the MAIN program gets the specified Functional
Readiness time interval and time increment, which it uses to set up a loop that com-
putes Functional Readiness and system Failure Probability data and outputs this in-
formation for each Functional Readiness time increment. The MAIN flow diagram
is shown in Figure F6.
READIN: Reads input data from a punched card file specifying Markov model transi-
tion rates, desired Functional Readiness time interval and increment, desired Failure
Probability time interval and increment, Functional Readiness Criteria, the Success
Configuration and the desired computational accuracy. The subroutine flow diagram
is indicated in Figure F7.
INITYZ: Initializes the computation by computing the transition rates.
It calls the subroutine FORMT, which computes a matrix T and its inverse TINV
used by the program to solve the Markov model equations. (The mathematical details
may be found at the end of this appendix.) The subroutine INITYZ also initializes
the indicator array INDIC, which is used to indicate the Functional Readiness
Configuration, and the array INDTMP, which is used to map the dependency between
modules. Also, the array MAP is constructed, indicating the relation between
the number of failed modules of a stage and the corresponding Markov state. The flow
diagram is shown in Figure F8.
FORMT (I):
Computes the matrices T and TINV for a particular stage (see solution at the end of this appendix).
I indicates the stage number. The flow diagram is displayed in Figure F9.
COMPUTE (AVT, AVBTY):
Computes the detected and undetected failure (or any other two selected failure mode)
probabilities for a certain Functional Readiness time and criterion. The failure prob-
abilities are stored in common arrays FAILP and FALUN where FAILP is the
MAIN
ENTER
CALL READIN
CALL INITYZ
SET UP FUNCTIONAL READINESS TIME LOOP
FUNCTIONAL READINESS TIME LOOP
CALL COMPUTE
CALL OUTPUT
STOP
FIGURE F6 MAIN PROGRAM FLOW DIAGRAM
READIN
ENTER
READ NO. OF DEPENDENCY STAGES, NIS
READ NO. OF NON-DEP. STAGES, NDS
LOOP I = 1, (NIS+NDS)
READ STAGE NO., MARKOV DIMENSION, NUMBER OF MODULES AND TRANSITION RATES LMDA(I,J,K) FOR STAGE I
READ FUNCTIONAL READINESS AND FAILURE PROBAB. TIME ENTRIES
READ NO. OF ENTRIES IN DEP. ARRAY, NARY
LOOP I = 1, NARY
READ DEPENDENCY ARRAYS NIND(I) AND NDEP(I,J)
READ NO. OF FUNC. READ. CONF., NAV
LOOP I = 1, NAV
READ THE FAILED MODULES, NA(I,K), K = 1,2,3
READ NO. OF SUCCESS CONF., NOSCOF
LOOP I = 1, NOSCOF
READ SUCCESS CONF. ICOF(I,J)
READ ACCURACY INDICATOR, NACCUR
EXIT
FIGURE F7 READIN FLOW DIAGRAM
INITYZ
FOR DEPENDENCY STAGES: COMPUTE TRANSITION RATES
CALL FORMT(I)
LOOP I = 1, NDS
FOR NON-DEP. STAGES: COMPUTE TRANSITION RATES
CALL FORMT(I)
SET INDIC AND INDTMP EQUAL TO ZERO
INITIALIZE MAP
FIGURE F8 INITYZ FLOW DIAGRAM
FORMT
ENTER
INITIALIZE T
SET THE DIAGONAL ELEMENTS T(I,J,J)=1
[decision block; the test condition is not legible]
YES
NO
RECURSIVELY COMPUTE THE T-MATRIX COMPONENTS
INVERT THE T-MATRIX
EXIT
SET T AND TINV EQUAL TO THE IDENTITY MATRIX
FIGURE F9 FORMT FLOW DIAGRAM
probability of being in any of the two last Markov states in any stage and FALUN is
the probability of being in the last Markov state in any stage. The array SPY, which
contains the computational truncation error estimates, is also computed. The range
of the above arrays is specified by the failure probability time range FPMT and
the increment FPDT read by INITYZ, with the i-th element in each array correspond-
ing to a failure probability time T = i • FPDT. The Functional Readiness time is
transferred from MAIN to COMPUTE in the argument AVT. COMPUTE calls the
subroutine AVAIL to compute the Functional Readiness probability, which is trans-
ferred to MAIN in the argument AVBTY. The flow diagram is shown in Figure F10.
OUTPUT (AVT, AVBTY):
Outputs the arrays FAILP, FALUN and SPY together with the Functional Readiness
(AVBTY) for a particular Functional Readiness time (AVT).
AVAIL (PRO, I, TIME):
Computes the Functional Readiness, PRO, at time TIME for the Functional Readiness
configuration specified by entry number I in the Functional Readiness table, which is
read by READIN and stored in the common array NA (I, K). The AVAIL subroutine
also sets the indicator array INDIC corresponding to the Functional Readiness Con-
figuration and, in case the Functional Readiness configuration specifies a dependency
module failure, rearranges the structure of the common arrays NIND and NDEP, which
specify module dependencies. The corresponding rearranged arrays are NTIND
and NTDEP.
The subroutine AVAIL uses the subroutine PROB to compute Functional Readiness and
SETETY to set the indicator array INDIC. The flow diagram is shown in Figure F11.
COMPUTE
ENTER
INITIALIZE
FUNCTIONAL READINESS LOOP, I = 1, NAV
CALL AVAIL
CALL FAILPR
ACCUMULATE SYSTEM FAILURE PROBABILITIES
NORMALIZE DATA
EXIT
FIGURE F10 COMPUTE FLOW DIAGRAM
AVAIL
ENTER
INITIALIZE
J=J+1
READ FUNCT. READINESS MODULE FAILURE INDICATOR NA(IN,J)
REARRANGE DEPENDENCY ARRAY
SET THE INDICATOR ARRAY INDIC
COMPUTE FUNCTIONAL READINESS
EXIT
FIGURE F11 SUBROUTINE AVAIL FLOW DIAGRAM
PROB (ISTAGE, IENTRY, IEXIT, P, TIME):
This subroutine computes, for stage ISTAGE, the probability P of being in Markov state
IEXIT at time TIME given state IENTRY at time zero. The subroutine uses the
matrices T and TINV previously calculated in FORMT.
SETETY (J):
The subroutine sets the failure condition array INDTMP according to the module failure
pattern specified by entry number J in the dependency table, which specifies the system
dependency structure. The flow diagram is shown in Figure F12.
STRIP (INPUT, NSTGE, NSTATE):
Finds the stage number XX and state Y from a three digit number INPUT = XXY.
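The decoding performed by STRIP amounts to an integer division by ten. A minimal sketch, with a hypothetical Python function standing in for the FORTRAN subroutine:

```python
def strip(code):
    """Split a three digit number XXY into (stage number XX, state Y)."""
    if not 0 <= code <= 999:
        raise ValueError("expected a number of at most three digits")
    stage, state = divmod(code, 10)
    return stage, state

stage, state = strip(213)  # stage 21, state 3
```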
FAILPR (FALP, FALND, SP):
This subroutine computes the failure probability arrays FAILP and FALUN for a certain
Functional Readiness configuration specified by the status of the array INDIC, which
was set in AVAIL. It uses the (rearranged) dependency arrays NTIND (I), scanning
through all entries I and failing combinations of dependency modules NTIND (I),
which in turn causes non-dependency modules NTDEP (I, J), J = 1, 2, ... to fail.
This is the actual implementation of the above described "exhaustive conditioning"
approach. The probability of each combination of dependency module failure states is
computed by calling the subroutine INDFP, and the conditional probability of the non-
dependency modules failing, given the particular combination of dependency module
failures, is computed by calling DEPFP. The system failure probability is then
computed by multiplying these two probabilities and summing over the different
dependency module failure combinations. To avoid calling the subroutine PROB
repeatedly, all needed transition probabilities are computed initially by calling PRINTZ,
which stores those probabilities in the array PROBAB, which is common to INDFP
and DEPFP.
Combinations of up to two different dependency module failures are considered if
the accuracy indicator NACCUR entered by READIN is set to zero. With NACCUR = 1,
combinations of up to three dependency module failures will be considered.
SETETY(J)
FAIL THE DEPENDENCY MODULE NTIND(J) BY SETTING INDTMP=2 FOR THIS MODULE
SCAN THROUGH DEPENDENCY ARRAY TO FIND AN ENTRY FOR WHICH LOPIND(I)=1
FAIL A NON-DEPENDENCY MODULE NDEP(I,L)
SET INDTMP FOR THIS MODULE=1
FIND THE CORRESPONDING DEPENDENCY TABLE ENTRY, M
FIGURE F12 SUBROUTINE SETETY FLOW DIAGRAM
The truncation error, caused by not covering all possible combinations of dependency
module failures, is the difference between unity and the sum of the probabilities of
all considered dependency module combinations. A flow diagram of the subroutine
FAILPR is outlined in Figure F13.
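The truncation error can be illustrated with a simplified sketch. It assumes that each dependency module fails independently with a known probability, which is a simplification of the Markov stage models used by CARSRA, and enumerates all failure combinations up to a chosen size. The function name and the numbers are illustrative.

```python
from itertools import combinations
from math import prod

def truncation_error(q, max_failures):
    """One minus the total probability of all failure patterns with at most
    max_failures failed modules; q[i] is the failure probability of module i."""
    n = len(q)
    covered = 0.0
    for k in range(max_failures + 1):
        for failed in combinations(range(n), k):
            covered += prod(q[i] if i in failed else 1.0 - q[i] for i in range(n))
    return 1.0 - covered

q = [1e-3] * 6                   # six hypothetical dependency modules
err_two = truncation_error(q, 2)     # corresponds to NACCUR = 0 (up to two failures)
err_three = truncation_error(q, 3)   # corresponds to NACCUR = 1 (up to three failures)
```

Raising the limit from two to three failures reduces the neglected probability mass, at the cost of enumerating more combinations.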
PRINTZ (TIME, NIS, NDS, DIM):
Computes Markov transition probabilities from state K to state J with J ≥ K for all
system stages and stores the result in the array PROBAB.
EQUAL (A,B):
Equates the two-dimensional array A with the two-dimensional array B.
INDFP (PR, FP, FPU, K):
Computes the probability PR of a certain dependency module failure pattern specified
by the state of the array INDTMP. The probability that this specified combination
leads to system failure is also computed, and the two failure mode probabilities are
stored in FP and FPU. The flow diagram is in Figure F14.
BINQ (M, N, K):
Computes the binomial coefficient $K = \binom{M}{N}$.
DEPFP (PFAIL, PUNDET):
Computes the conditional probabilities of two system failure modes given a particular
dependency module failure state which causes failure of non-dependency modules as
specified by the status of the array INDTMP. It also scans through the stage success
table and accumulates success probabilities over all combinations of functional elements
equivalent to system success. The flow diagram may be found in Figure F15.
ENTER
SET UP FAIL. PROB. TIME LOOP
TIME LOOP
CALL PRINTZ
INDTMP=INDIC
COMPUTE FAIL. PROB. GIVEN NO DEPENDENCY MOD. FAILURE
ONE FAILURE LOOP
FAIL ONE DEP. MODULE
COMPUTE AND ACCUMULATE FAIL. PROB.
TWO FAILURE LOOP
FAIL A SECOND DEPENDENCY MODULE
COMPUTE AND ACCUMULATE FAILURE PROB.
THREE FAILURE LOOP
FAIL A THIRD DEP. MODULE
COMPUTE AND ACCUMULATE FAIL. PROB.
FIGURE F13 FLOW DIAGRAM FOR SUBROUTINE FAILPR
INDFP
ENTER
LOOP I = 1, NIS
INITIALIZE VARIABLES
FIND THE NUMBER OF INDIRECTLY FAILED MODULES IN A STAGE BY SUMMING INDTMP=1
FIND NUMBER OF DIRECTLY FAILED MODULES IN A STAGE BY SUMMING INDTMP=2
FIND MARKOV ENTRY AND EXIT STATES
COMPUTE THE TRANSITION PROBABILITY FROM ENTRY TO EXIT STATE, P
COMPUTE THE FAILURE PROBABILITIES FP AND FPU
EXIT
FIGURE F14 INDFP FLOW DIAGRAM
DEPFP
ENTER
LOOP I = 1, NDS
FIND THE NUMBER OF INDIRECTLY FAILED MODULES IN A STAGE BY SUMMING THE MODULES FOR WHICH INDTMP=1
FIND THE MARKOV ENTRY STATE
COMPUTE FAILURE PROBABILITIES FP AND FPU
ACCUMULATE SURVIVAL PROBABILITIES
SCAN THE SUCCESS CONFIGURATIONS TO ACCUMULATE SYSTEM SURVIVAL PROBABILITIES
COMPUTE FAILURE PROBABILITIES
EXIT
FIGURE F15 DEPFP FLOW DIAGRAM
2.0 CARSRA - A USER'S GUIDE
2.1 General
The CARSRA program is coded in FORTRAN IV with approximately 700
FORTRAN statements. It requires 100,000 core locations to run, and
the execution time varies with the complexity of the system to be analyzed
and the selected program option, the typical execution time being in the
range of 1-30 seconds.
2.2 Input Description
All input data is on punched cards and must be input in a prescribed order.
The following comments are made in relation to Table F1, which specifies the
required input data, the corresponding variable names used by the program,
and the input format.
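As an illustration of the fixed-column card layout, the sketch below splits one card image into 10-column integer fields, assuming that the integer entries of Table F1 use FORTRAN I10 fields (for example three such fields for a stage card). The helper name is hypothetical, and embedded blanks inside a field are not handled.

```python
def read_i10_fields(card, count):
    """Split a card image into `count` right-justified 10-column integer fields.
    A completely blank field is read as zero."""
    card = card.ljust(10 * count)
    fields = []
    for i in range(count):
        text = card[10 * i:10 * (i + 1)].strip()
        fields.append(int(text) if text else 0)
    return fields

# Hypothetical stage card: stage number 21, dimension 4, 3 modules.
stage_card = f"{21:10d}{4:10d}{3:10d}"
nst, ndim, modn = read_i10_fields(stage_card, 3)
```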
2.2.1 Dependency and Non-Dependency Stages
The system is partitioned into dependency and non-dependency stages, with a
failure of a dependency stage module causing failure of a module in a dif-
ferent stage. In cases where a stage is both a dependency and a non-
dependency stage, it will, in the program input, be identified as a depen-
dency stage.
Dependency stages are assigned numbers in the range 1-20 consecutively,
starting with 1. If failure of a certain dependency stage module causes another
dependency stage module to fail, the former stage should be assigned a
lower stage number than the latter. Dependency stage numbers should be
assigned in consecutive order without leaving a number unassigned inside
the array.
Non-dependency stages are assigned numbers in the range 21-50 consecu-
tively, starting with 21.
2.2.2 Stage Dimension, Numbers of Modules and Transition Rates
The assigned stage number (NST), the dimension (NDIM), and the number of
modules in the stage (MODN) are entered on one card. The
dimension specifies the number of states in the Markov model for the stage
and controls the reading of the following (NDIM-1) cards, which
TABLE F1: CARSRA Input Data

Input Data                           Variable Name        Format     No. of Cards
-----------------------------------  -------------------  ---------  ---------------------
No. of non-dependency (NDS) and      NDS, NIS             2I10       1
dependency stages (NIS)
Stage dimension and No. of           NST, NDIM, MODN      3I10       1 (x NDS+NIS)
modules for a certain stage,
NST, followed by
The corresponding transition         LMDA (NST, K, J)     10(F8.2)   NDIM-1 (x NDS+NIS)
rate matrix in failures per          J = 1, NDIM
million hours.                       K = 1, (NDIM-1)
Functional readiness and failure     AMT, ADT, FPMT,      4(F10.5)   1
probability time entries             FPDT
No. of dependency array entries      NARY                 I10        1
Dependency Structure                 NIND(I), NDEP(I,J)   20(I4)     NARY
                                     I = 1, NARY
                                     J = 1, 19
No. of Functional Readiness          NAV                  I10        1
Configurations
Failed Modules                       NA (I, K)            3(I4)      NAV
                                     I = 1, NAV
                                     K = 1, 3
No. of Success Configurations        NOSCOF               I10        1
Success Configurations               ICOF (I, J)          50I1       NOSCOF
                                     I = 1, NOSCOF
                                     J = 1, 50
Accuracy Indicator                   NACCUR               I10        1

The stage card (NST, NDIM, MODN) and its NDIM-1 transition rate cards are
entered NDS+NIS times, once for each stage.
specify the transition rates LMDA (NST, K, J) in failures per million
hours. LMDA (NST, K, J) is the transition rate from state K to state J in
stage NST. The values LMDA (NST, K, J), J = 1, 2, ..., NDIM are entered on
one card for each value of K. Only (NDIM-1) cards, corresponding to
K = 1, 2, ..., (NDIM-1), have to be entered for each stage, since the last
state always has zero transition rates. Only transitions from a lower
order to a higher order state are permitted, i.e., LMDA (NST, K, J) must
be equal to zero (left blank) for K > J. This constraint implies that
CARSRA, as currently coded, cannot model equipment repair.
Markov state one always models no module failures, Markov state two
one module failure, and Markov state three two module failures.
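The constraints on the transition rate cards can be checked before a deck is punched. The following is a minimal sketch in Python, not part of CARSRA; the function name `check_rate_matrix` and the list-of-rows layout are assumptions made for illustration. The sample matrix uses the stage 1 rates of the example in section 2.4.

```python
def check_rate_matrix(lmda, ndim):
    """Return a list of problems found in an (NDIM-1) x NDIM rate matrix.

    lmda[k-1][j-1] holds LMDA(K, J), the rate from state K to state J,
    in failures per million hours (hypothetical in-memory layout).
    """
    problems = []
    if len(lmda) != ndim - 1:
        problems.append(f"expected {ndim - 1} rows, got {len(lmda)}")
    for k, row in enumerate(lmda, start=1):
        if len(row) != ndim:
            problems.append(f"row {k}: expected {ndim} entries, got {len(row)}")
            continue
        for j, rate in enumerate(row, start=1):
            if rate < 0:
                problems.append(f"LMDA({k},{j}) is negative")
            if k > j and rate != 0:
                # Transitions to a lower order state would model repair.
                problems.append(f"LMDA({k},{j}) must be zero for K > J")
    return problems


# Stage 1 of the example in section 2.4 (dimension 5, so 4 rows of 5 rates).
stage1 = [
    [0, 300, 0, 0, 0],
    [0, 0, 180, 20, 0],
    [0, 0, 0, 90, 10],
    [0, 0, 0, 0, 0],
]
print(check_rate_matrix(stage1, 5))
```

An empty list means the matrix satisfies the stated constraints; the check says nothing about whether the rates themselves are realistic.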
2.2.3 Functional Readiness and Failure Probability Time Entries
Functional Readiness Time span (AMT) and time increment (ADT) are
entered on one card together with Failure Probability Time span (FPMT)
and time increment (FPDT). For each Functional Readiness Time
I x ADT ≤ AMT, I = 0, 1, 2, ..., a table of the Failure Probabilities
as a function of the Failure Probability Time J x FPDT ≤ FPMT,
J = 1, 2, ..., is printed. If only the Failure Probabilities are of
interest, enter AMT = 0, ADT = 1.
2.2.4 Dependency Array
The number of dependency modules in the system (NARY) is entered on a
separate card, followed by NARY cards specifying the system dependency
configuration. Each dependency module NIND (I), I = 1, ..., NARY, will,
when failed, cause failure of modules NDEP (I, J), J = 1, ..., N, with
N ≤ 19. The modules are specified by NIND and NDEP in the form XXY, with
XX being the stage number and Y the module number within the stage. (See
the example below.)
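The XXY module code can be illustrated with a short Python sketch. This is not CARSRA code; the function names are hypothetical, the treatment of zero as an unused NDEP field is an assumption, and the card values are illustrative only.

```python
def decode_module(code):
    """Split an XXY module code into (stage number XX, module number Y)."""
    if code <= 0:
        raise ValueError("module codes are positive integers")
    stage, module = divmod(code, 10)  # Y is the last decimal digit
    return stage, module


def parse_dependency_card(fields):
    """Decode one dependency card: NIND first, then up to 19 NDEP entries.

    Assumption: unused NDEP fields are read as zero and are skipped.
    """
    nind, *ndep = fields
    if len(ndep) > 19:
        raise ValueError("at most 19 dependent modules per card")
    return decode_module(nind), [decode_module(c) for c in ndep if c > 0]


# Illustrative card: module 1 of stage 1 takes down module 1 of stages 21 and 24.
print(parse_dependency_card([11, 211, 241, 0, 0]))
```

Because Y is a single digit, this encoding allows at most nine modules per stage.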
2.2.5 Functional Readiness Table
The number of Functional Readiness configuration entries, NAV, is specified
on a separate card followed by NAV cards, one for each configuration.
Each configuration is characterized by up to 3 failed modules NA (I, K),
K = 1, 2, 3, where the module is indicated by XXY as before (2.2.4). The
Functional Readiness probability computed by the program is the probability
of having any one of the specified system failure patterns at a given time.
2.2.6 Stage Success Table
NOSCOF, entered on a separate card, specifies the number of stage failure
patterns equivalent to system success. One card is thereafter entered
for each pattern, which is specified by a 1 (one) in each column
corresponding to a failed stage. If all stages are essential for system
success, NOSCOF is equal to one, followed by a blank card. If two stages,
for example the non-dependency stages 21 and 22, are redundant, three
cards are required:
1) Col 21 = 0, Col 22 = 0; 2) Col 21 = 1, Col 22 = 0; 3) Col 21 = 0, Col 22 = 1.
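The card images for the stage success table can be sketched as follows. This is an illustrative Python fragment, not CARSRA code: the function names are hypothetical, and writing explicit zeros instead of blanks is an assumption (the guide itself uses a blank card for the pattern with no failed stage).

```python
def success_card(failed_stages, width=50):
    """Return one 50-column card image with '1' in each failed-stage column."""
    columns = ["0"] * width
    for stage in failed_stages:
        if not 1 <= stage <= width:
            raise ValueError(f"stage number {stage} outside 1..{width}")
        columns[stage - 1] = "1"  # card columns are numbered from 1
    return "".join(columns)


def success_deck(patterns):
    """NOSCOF card (format I10) followed by one card per success pattern."""
    return [f"{len(patterns):10d}"] + [success_card(p) for p in patterns]


# Stages 21 and 22 redundant: success with neither failed, or with either one failed.
for card in success_deck([[], [21], [22]]):
    print(card)
```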
2.2.7 Accuracy Indicator
The accuracy indicator, NACCUR, specifies the level to which conditioning
upon the combinations of dependency module failures is performed. If
NACCUR = 0 (lower accuracy setting), all combinations of zero, one and two
dependency module failures are considered, plus failure combinations
equivalent to system failure. If NACCUR = 1, up to three dependency module
failures are considered, plus failure combinations equivalent to system
failure.
The higher accuracy could be required when treating a system with high
module redundancy (for example a quad system) or when the mission time is
long. However, the program run time could be on the order of ten times
longer for the higher accuracy setting.
The truncation error is indicated by an accuracy number. The correct failure
probability value will lie in the range
[printed output value, printed output value + accuracy]
2.3 Output Description
A table of system failure probabilities is printed for each Functional
Readiness Time I x ADT ≤ AMT. The table entries are the Failure
Probability Times J x FPDT ≤ FPMT, the probabilities of a detected and an
undetected system failure (or any other two failure modes), and the
truncation error band.
2.4 An Example
The use of the CARSRA program will be illustrated by a simple example.
Consider a system with the dependency structure depicted in Figure F16.
[Figure F16: block diagram of the example system dependency structure]
There are two dependency stages, 1 and 2, and four non-dependency stages,
21-24. The numbers at the upper right-hand corners of the stage blocks
indicate the modular redundancy level.
Assume further that the stage Markov model is the one depicted in Figure F17
for the five triple redundant stages and the one in Figure F18 for the single
duplex stage.
[State diagram; legible labels: 0 FAILURE; STAGE FAILURE (DETECTED, UNDETECTED)]
FIGURE F17 TRIPLEX STAGE MARKOV MODEL
[State diagram; legible labels: 0 FAILURE; 1 FAILURE; STAGE FAILURE (DETECTED, UNDETECTED)]
FIGURE F18 DUPLEX STAGE MARKOV MODEL
The transition rates are given by Table F2:
TABLE F2. CARSRA Example Transition Rates/10^6 Hrs

Stage    λ12    λ13    λ23    λ24    λ34    λ35
-----    ---    ---    ---    ---    ---    ---
  1      300      0    180     20     90     10
  2       60      0     32      8     15      5
 21      900      0    600      0    300      0
 22      750      0    450     50    200     50
 23      180     20     50     50      -      -
 24      300      0    150     50     50     50
Functional Readiness data is desired in 50-hour increments (ADT) up to
100 hours (AMT) for a Functional Readiness criterion corresponding to a
single module failure in any of the stages 21, 22 or 24.
Failure probabilities in 1-hour increments (FPDT) up to 5 hours (FPMT)
are to be generated.
Stages 21 and 22 are redundant in the sense that the system will survive
a loss of any one but not both of these stage functions.
The CARSRA input data deck is presented in Table F3.
The output is presented in Table F4. The input data is first echoed,
after which the computed reliability data is listed. The negative accuracy
band is caused by computational roundoff errors.
TABLE F3: CARSRA Input Example
[Handwritten FORTRAN coding form showing the input card deck for the
example. The individual card entries are not legible in the scanned
original.]
TABLE F4: CARSRA Output Example
[Line-printer listing; most numerical entries are not legible in the
scanned original. The listing consists of two parts.
ECHO CHECK: the number of dependency and independent stages; for each
stage its dimension, number of modules and transition rate matrix
(stages 1, 2, 21, 22 and 24: dimension 5, 3 modules; stage 23:
dimension 4, 2 modules); AMT, ADT, FPMT and FPDT; the dependency array;
the availability configurations; the success configurations; and the
accuracy setting.
COMPUTED RELIABILITY VALUES: for each availability time, the availability
followed by a table of SYSTEM FAILURE PROBABILITIES with the columns
EXPOSURE TIME, FAIL. PROB., UNDET. FAIL PROB. and ACCURACY. The legible
ACCURACY entries are negative and of order 1E-13.]
APPENDIX F
REFERENCES
1. R. H. Blazek, R. E. Thomas, R. K. Thatcher and J. L. Easterday,
"TAbular System Reliability Analysis", Battelle, Columbus Lab.,
AFFDL-TR-71-128.
2. "An Automatic Reliability Mathematical Model", Boeing Document No. D6A-10500-1.
3. Y. W. Ng and A. Avizienis,
"A Unifying Reliability Model for Closed Fault Tolerant Systems",
1975 Int. Symp. on Fault Tolerance Computers, Paris, June 18-20.
4. Jean-Claude Laprie,
"Reliability and Availability of Repairable Structures",
1975 Int. Symp. on Fault Tolerance Computers, Paris, June 18-20.
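SOLUTION OF A DIAGONAL MARKOV EQUATION SYSTEM
The system to be solved is:
$$\dot{p} = A^T p \quad \text{with } p(0) \text{ known}$$
and
$$A = \begin{bmatrix} \lambda_{11} & \lambda_{12} & \lambda_{13} & \cdots & \lambda_{1n} \\ 0 & \lambda_{22} & \lambda_{23} & \cdots & \lambda_{2n} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & \lambda_{nn} \end{bmatrix}$$
Find a matrix, T, such that:
$$T^{-1} A^T T = D$$
where
$$D = \mathrm{diag}(\lambda_{11}, \lambda_{22}, \ldots, \lambda_{nn}) \quad \text{(Diagonal)}$$
Then we can make the transformation
$$q = T^{-1} p$$
$$\dot{q} = T^{-1} A^T T q = D q$$
with the solution
$$q_i(t) = e^{\lambda_{ii} t} \, q_i(0)$$
or
$$q(t) = \mathrm{EXP} \cdot q(0)$$
where EXP = diagonal with elements $e^{\lambda_{ii} t}$
$$\therefore \; p(t) = T q(t) = T \cdot \mathrm{EXP} \cdot T^{-1} p(0)$$
T may be determined from:
$$A^T T = T D$$
$$\begin{bmatrix} \lambda_{11} & 0 & \cdots & 0 \\ \lambda_{12} & \lambda_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ \lambda_{1n} & \lambda_{2n} & \cdots & \lambda_{nn} \end{bmatrix} \begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ t_{21} & t_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ t_{n1} & t_{n2} & \cdots & t_{nn} \end{bmatrix} = \begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ t_{21} & t_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ t_{n1} & t_{n2} & \cdots & t_{nn} \end{bmatrix} \begin{bmatrix} \lambda_{11} & & & 0 \\ & \lambda_{22} & & \\ & & \ddots & \\ 0 & & & \lambda_{nn} \end{bmatrix}$$
Setting elements i, j equal we get:
$$\sum_{k=1}^{n} \lambda_{ki} t_{kj} = t_{ij} \lambda_{jj}$$
and since $\lambda_{ki} = 0$, k > i, and $t_{kj} = 0$, k < j,
$$\lambda_{ii} t_{ij} + \sum_{k=j}^{i-1} \lambda_{ki} t_{kj} = t_{ij} \lambda_{jj}$$
$$t_{ij} = \frac{1}{\lambda_{jj} - \lambda_{ii}} \sum_{k=j}^{i-1} \lambda_{ki} t_{kj}, \quad i > j,$$
which may be solved by recursion, arbitrarily starting with $t_{jj} = 1$, i.e.
$$t_{j+1,j} = \frac{\lambda_{j,j+1}}{\lambda_{jj} - \lambda_{j+1,j+1}} \, t_{jj}, \quad \text{etc.}$$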
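The recursion for T and the resulting expression for p(t) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CARSRA implementation: the function names are hypothetical, the matrix A is assumed to be upper triangular with a diagonal equal to the negative of the total exit rate from each state, and the diagonal entries must all be distinct because the recursion divides by their differences.

```python
import math


def eigenvector_matrix(A):
    """Lower-triangular T with A^T T = T D, D = diag(A), and t_jj = 1."""
    n = len(A)
    T = [[0.0] * n for _ in range(n)]
    for j in range(n):
        T[j][j] = 1.0
        for i in range(j + 1, n):
            s = sum(A[k][i] * T[k][j] for k in range(j, i))
            T[i][j] = s / (A[j][j] - A[i][i])  # needs distinct diagonal entries
    return T


def forward_solve(T, b):
    """Solve T x = b for a lower-triangular T with unit diagonal."""
    x = []
    for i, b_i in enumerate(b):
        x.append(b_i - sum(T[i][k] * x[k] for k in range(i)))
    return x


def state_probabilities(A, p0, t):
    """p(t) = T EXP T^-1 p(0) for the system dp/dt = A^T p."""
    n = len(A)
    T = eigenvector_matrix(A)
    q0 = forward_solve(T, p0)  # q(0) = T^-1 p(0)
    q = [math.exp(A[i][i] * t) * q0[i] for i in range(n)]
    return [sum(T[i][k] * q[k] for k in range(i + 1)) for i in range(n)]


# Hypothetical 3-state chain (rates per hour): state 1 -> 2 -> 3, no repair.
A = [[-3.0, 3.0, 0.0],
     [0.0, -2.0, 2.0],
     [0.0, 0.0, 0.0]]
print(state_probabilities(A, [1.0, 0.0, 0.0], 0.5))
```

Rates quoted in failures per million hours would have to be multiplied by 1E-6 before being used with a time in hours.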
APPENDIX G
ARCS COVERAGE STUDY REPORT
(General Electric Project Memorandum 76-AR-01)
PURPOSE
The purpose of this memo is to provide a final report on GE's coverage study
activities as required by SOW paragraphs 3.2-4 and 3.6-5.
INTRODUCTION
Using the standard terminology adopted for the ARCS study (reference 1), coverage
is generally defined as "the conditional probability that a stage continues to perform
the required function(s) given a failure (permanent fault)." Note that transient faults
are excluded from this definition, and that the terms "failure" and "permanent fault"
are equivalent within the ARCS vocabulary. The specific coverage definitions of interest
for ARCS have been obtained by applying the above conceptual definition to each of the
stages within a detailed ARCS reliability model. In this context, there is an array of
coverage values, or parameters, which is of critical importance in the determination
of the ARCS functional availability and mission reliability. For this reason, a
significant portion of GE's ARCS effort has been concerned with the problem of evaluating
fault coverage. The task outline shown in figure G-1 indicates the scope of the coverage
study activity.
Most of the significant results of the coverage study have been presented previously at
various ARCS oral reviews with the aid of viewgraphs. The discussion which follows
is built upon much of the earlier viewgraph material and is expanded in some areas.
Generally speaking, the discussion follows the task outline in figure G-1.
[Figure G-1: coverage study task outline. The figure is printed rotated in
the original and its text is not recoverable from the scan.]
DISCUSSION
The ARCS reliability model is based upon a discrete-state,
continuous-time, Markov model for each triplex stage
(3 identical modules operating in parallel) within the
system. With this modeling approach it is possible
to define fault coverages in terms of Markov states and
transitions. Figure G-2 provides an illustration of the
basic concepts involved. For each Markov state i there
is a corresponding fault rate $\lambda_i$ which determines the
probability of exiting the state due to the occurrence
of a (single) fault. Specifically, the amount of time the
stage spends in state i is an exponentially distributed
random variable with mean equal to the reciprocal of $\lambda_i$.
In general, the fault rate $\lambda_i$ includes both permanent
and transient faults. However, in this discussion of
coverage the term $\lambda_i$ will be restricted to include only
the permanent fault rate. (Transient faults are covered
by a composite Markov model which uses "leakage" parameters
in a similar fashion to coverage parameters.)
Upon exiting state i the stage may transition to any
number of other Markov states. The fault rate associated
with each possible transition path is a fixed proportion
of the total exit rate from state i. Thus, the failure
rate ratio term, $r_{ij}$, defined in figure G-2 is a direct
measure of the proportion of $\lambda_i$ which is assigned to the
i to j transition.
For a given stage there is a collection of particular
states, the set S, for which the stage successfully performs
its required function(s). Relative to any state i contained
in S, the coverage $C_i$ is defined as,
$$C_i = P\{\text{continued stage success} \mid \text{transition from state } i \text{ due to a permanent fault}\}$$
which may be expressed as a sum of failure rate ratios
as follows,
$$C_i = \sum_{j \in S} r_{ij} \qquad (1)$$
[Figure G-2: coverage definitions in terms of Markov states and
transitions, showing a state i, its permanent-fault exit rate
$\lambda_i$, and the failure rate ratio $r_{ij}$ assigned to each i to j
transition. The figure is printed rotated in the original and its text
is not recoverable from the scan.]
Thus, coverage relative to state i is simply a measurement
of the total proportion of the exit rate $\lambda_i$ which is assigned
to state transitions that maintain stage success (i.e. all
i to j transitions where j is contained in S). From figure G-2 it
follows that $C_i$ may be expressed alternately as,
$$C_i = \frac{1}{\lambda_i} \sum_{j \in S} \lambda_{ij} \qquad (2)$$
For most of us, these equations do not provide much intuitive
feeling as to the real hardware oriented meaning of coverage.
Perhaps the first obstacle is the definition of the Markov
states relative to a particular triplex set of modules.
For discussion, let's consider the four most obvious states
for a triplex module set: operating with none failed (triplex),
operating with one failed (duplex), operating with two
failed (simplex), and not operating and/or three failed
(stage failure). If the failure rate of each module is $\lambda$,
then the exit rate from the triplex state is $3\lambda$. Similarly,
the exit rate from the duplex state is $2\lambda$, and from the
simplex state is $\lambda$. Now, suppose that some proportion,
$r_{21}$, of the exit rate from the duplex state corresponds
to a transition to simplex operation, and the remaining
proportion, $1 - r_{21}$, corresponds to stage failure.
Figure G-3 provides a state transition diagram for this
example.
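[State transition diagram: triplex operating state, duplex operating
state, simplex operating state and stage failure, with the duplex state
exiting either to the simplex state or directly to stage failure.]
$\lambda$ = module failure rate
$\lambda_3 = 3\lambda$
$\lambda_2 = 2\lambda$
$\lambda_{21} = 2\lambda r_{21}$
$\lambda_{20} = 2\lambda (1 - r_{21})$
$C_1 = \lambda_{32} / \lambda_3 = 1$
$C_2 = \lambda_{21} / \lambda_2 = r_{21}$
Figure G-3. Simple Triplex Stage Model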
It follows immediately from equation (2) and figure G-3
that,
First failure coverage $C_1 = 1$
Second failure coverage $C_2 = r_{21}$
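The coverage values of the simple triplex example can be reproduced numerically. The Python sketch below is illustrative only: the function name, the dictionary layout, and the values chosen for the module failure rate and for r21 are assumptions, and states are labeled 3 (triplex), 2 (duplex), 1 (simplex) and 0 (stage failure).

```python
def coverage(rates, success_states, i):
    """Coverage relative to state i: the share of the exit rate from i
    that leads to states in the success set S (equation (2))."""
    exits = rates[i]  # maps destination state j -> rate of the i to j transition
    total = sum(exits.values())
    kept = sum(rate for j, rate in exits.items() if j in success_states)
    return kept / total


lam = 100.0   # hypothetical module failure rate, failures per million hours
r21 = 0.9     # hypothetical share of duplex exits that reach simplex operation

rates = {
    3: {2: 3 * lam},                                # triplex -> duplex
    2: {1: 2 * lam * r21, 0: 2 * lam * (1 - r21)},  # duplex -> simplex or failure
    1: {0: lam},                                    # simplex -> stage failure
}
S = {3, 2, 1}  # operating (success) states

print("first failure coverage :", coverage(rates, S, 3))
print("second failure coverage:", coverage(rates, S, 2))
```

Leaving the triplex state always lands in the duplex state, so the first failure coverage is one regardless of the rate chosen; the second failure coverage returns the assumed r21.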
This simple triplex example has so far shown that a Markov
stage model may be used in a rather straightforward
manner to obtain definitions for two important coverage
parameters. However, these definitions are still somewhat
lacking in intuitive appeal. The most common difficulty
in this area is to relate the more familiar concepts of
percent fault detection and isolation to the concept of
coverage. In the preceding example this involves just
the duplex operating state, where an additional failure
can only be tolerated if it is detected and isolated,
so that successful redundancy degradation to simplex
operation is achieved. Through the use of comparison
monitoring it is easy to see how each module in the
duplex state can detect a fault occurrence. Further,
with the addition of a self-monitoring capability within
each module, it is evident that some percentage of faults
local to a module will be successfully isolated. Thus,
in this example at least, the intuitive point of view
might be that second failure coverage, and the percentage
of second failures that are successfully detected and
isolated, may be somewhat the same thing. It turns out
that there is a very simple relationship between these
two measures of fault tolerance which is of critical
importance in the fault coverage evaluation problem.
This relationship can be developed with the aid of the
following definitions; let,
$\bar{\lambda}_{ij}$ = the mean failure rate of the N permanent faults which
can cause a transition from state i to j
$\bar{\lambda}_i$ = the mean failure rate of the total number M of permanent
faults which can cause an exit from state i.
These definitions of course assume that the number of
possible permanent faults in the modules under consideration
is finite. For a given state i it follows from figure G-2
that $r_{ij}$ may be expressed in terms of the above mean
failure rates as,
$$r_{ij} = \left(\frac{N}{M}\right) \left(\frac{\bar{\lambda}_{ij}}{\bar{\lambda}_i}\right) \qquad (3)$$
The ratio (N/M) is a positive fraction (a number between
0 and 1) since N is always less than or equal to M.
The ratio ($\bar{\lambda}_{ij} / \bar{\lambda}_i$) is a positive number which, in general,
may be greater or less than one. The product of the
two ratios is guaranteed to be a number less than or
equal to one, because $N \bar{\lambda}_{ij} = \lambda_{ij}$, and this term is
always less than or equal to $M \bar{\lambda}_i = \lambda_i$. For discussion
purposes it is convenient here to consider the simple
case where in equation (2) $C_i = r_{ij}$. Then, the ratio
(N/M) expressed as a percentage may be viewed as the
percent of "covered" failures, meaning those failures
that are detected and isolated, so that one may write,
$$\% \, r_{ij} = (\% \text{ of "covered" failures}) \cdot \left(\frac{\bar{\lambda}_{ij}}{\bar{\lambda}_i}\right)$$
Expressed this way, it is clear that the ratio of mean
failure rates provides a weighting factor which increases
or decreases the actual % $r_{ij}$. Thus, the most common
intuitive view of coverage may be placed in agreement
with a Markov modeling approach by the simple addition
of a failure rate weighting factor.
At this point, the central issues of the coverage evaluation
problem are in evidence. To evaluate any $C_i$ parameter
it is necessary to know the following:
• the total number (M) of possible permanent faults
in the modules under consideration.
• the failure rate corresponding to each permanent fault.
• the transition result from each permanent fault occurrence.
Knowing these things, the terms N, $\bar{\lambda}_{ij}$ and $\bar{\lambda}_i$ are known
for each $r_{ij}$, and hence for $C_i$. The balance of this
memo describes GE's study efforts to find practical ways
to obtain the above data.
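The effect of the failure rate weighting factor can be seen in a small numerical sketch. The Python fragment below is illustrative only; the function name and the fault list are hypothetical, chosen so that three of four faults are covered while the single uncovered fault carries most of the failure rate.

```python
def rate_ratio(fault_rates, causes_transition):
    """Return (N/M, weighting factor, r_ij) for one i to j transition.

    fault_rates       -- failure rate of each of the M permanent faults
    causes_transition -- True for each of the N faults that cause the
                         i to j transition
    """
    M = len(fault_rates)
    selected = [r for r, hit in zip(fault_rates, causes_transition) if hit]
    N = len(selected)
    if N == 0:
        return 0.0, 0.0, 0.0
    mean_i = sum(fault_rates) / M    # mean rate of all M faults
    mean_ij = sum(selected) / N      # mean rate of the N faults
    count_fraction = N / M
    weight = mean_ij / mean_i
    return count_fraction, weight, count_fraction * weight


# Hypothetical module: four faults, the uncovered one has five times the rate.
fraction, weight, r_ij = rate_ratio([1.0, 1.0, 1.0, 5.0],
                                    [True, True, True, False])
print(f"covered faults: {fraction:.0%}, weight: {weight}, r_ij: {r_ij}")
```

Here 75% of the faults are covered by count, yet the rate-weighted ratio is well below that, which is the point of the weighting factor in equation (3).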
Failure Rate Data; Sources and State-of-the-Art
There are several accepted sources from which to obtain
component failure rate data. Given a complete component
parts list for the module under consideration it is a
simple procedure to estimate the total module failure rate.
However, this data alone is not sufficient for undertaking
a coverage evaluation. What is required is data concerning
the failure rate attributable to each of the possible
failure modes of each component. Only with this type
of data can one begin to determine the total number of
possible permanent faults in a module, and the failure
rate associated with each fault.
An industry review was conducted to determine the
availability of the needed data. The following were the
primary sources considered:
• MIL-HDBK-217B (and related MIL-Specs)
• Failure Rate Data Exchange (Formerly FARADA)
GIDEP - Government-Industry Data Exchange Program
• Reliability Branch - Reliability Analysis Center (RBRAC)
USAF - Rome Air Development Center
• ARINC - Aeronautical Radio, Inc.
In the general area of military standards and specifications,
figure G-4 summarizes what is available. MIL-HDBK-217B
is the primary source today for electronic component
failure rate data. This handbook does not provide data
for component failure modes. None of the military
reliability and/or component standards appear to be concerned
with this type of data.
[Figure G-4: summary of military standards and specifications relating to
reliability and failure rate data. The figure is printed rotated in the
original and its text is not recoverable from the scan.]
For later reference in this memo, it is significant
to note here that the "Established Reliability" (ER)
specifications provide failure rate values that are guaranteed
with either a standard 60% or 90% confidence level. This is
the best failure rate guarantee available. MIL-HDBK-217B
provides only a "point estimate" with unknown confidence
for component failure rates.
The FARADA/GIDEP data exchange is a significant source
for actual field data on component and subassembly failure
rates. Unfortunately, failure mode data is not available
from this source. ARINC was contacted because of their
apparent activity in the general reliability area (reference 3),
and their extensive involvement with commercial aircraft
electronic equipment standards. While they were once
(early 1960's) engaged in reliability work, they are now
no longer active in this area. ARINC is therefore not
a source for any failure rate data.
The Reliability Analysis Center (staffed by subcontract
personnel from IIT Research Institute, Chicago, Illinois)
at RADC provided the best failure mode data (reference 2)
that was found in the review. This data was for microcircuit
devices. Figure G-5 shows a table which is excerpted from
reference 2. As indicated in the figure, the functional
failure modes for digital TTL devices may be roughly
categorized as 75% "stuck-at" faults and 25% "logical" faults.
This means that I/O signal pin faults corresponding to
"stuck-at-1" and "stuck-at-0" conditions are the most
probable TTL failure modes. The types of faults in the
"logical" fault category are those for which the device
performs the incorrect logic function, without any
"stuck-at" condition appearing on the I/O signal pins.
For example, a failure internal to the device may cause
a JK flip-flop output to toggle when it should stay set
to one state.
While the Reliability Analysis Center (RAC) data is in
the right direction, it is hardly differentiating enough
to distinguish the functional failure modes of a 4-stage
synchronous binary counter from those, say, of a look-ahead
carry generator. Considering the complexities of many
MSI and LSI digital microcircuit devices, it is obvious
that such a vague characterization as the "No Output" shown
in figure G-4 is a very gross functional failure mode description.
[Figure G-5: table of digital TTL microcircuit failure modes, excerpted
from reference 2; one legible column heading is PACKAGE TYPE. The table
entries are not recoverable from the scan. Legible note: the remaining
26.7% of the faults fall into the general category of "logical faults".]
Nevertheless, the RAC data does provide a basis for making
some assumptions for the purposes of analysis. The percentage
of "stuck-at" faults may be assumed to be uniformly
distributed across the I/O signal pins of a device.
The percentage of "logical" faults may be treated by
making subjective assessments as to the functional failure
modes which are in some way "most critical" to the analysis.
The outcome of the failure rate data review may be
briefly summarized with the following observations and
conclusions:
• Current reliability analysis and testing of components
is almost exclusively concerned with measuring and/or
predicting the mean life parameter of the exponential
model.
• The apparent state-of-the-art does not include the
meaningful evaluation of the relative distribution
of component failure modes (this is especially the
case for microcircuit devices).
• The best available data for microcircuit devices is
primarily oriented to physical rather than functional
failure modes.
• With today's data base, the exact evaluation of
coverage in terms of the failure rate of each
permanent fault is impossible.
This last conclusion is significant and unavoidable;
the necessary failure rate data for all types of components
and their failure modes is simply not available. Further,
it does not even seem reasonable to expect that such data
will ever become available. It is highly unlikely that
anyone will ever attempt to identify, and estimate failure
rates for, all possible failure modes of many types of
complex electronic devices. Clearly, to evaluate fault
coverage it is going to be necessary to find ways to
overcome the absence of very basic data. This problem
will be addressed in a later section of this memo.
Failure Mode and Effect Analysis
In addition to the requirement to know the failure rate
assigned to each permanent fault, it is necessary to
know the effect of each fault. Generally the procedure
known as Failure Mode and Effect Analysis (FMEA) is used
to obtain such information. This procedure has a history
of application to many types of systems where the performance
of the system under failure conditions is important or
even critical for safety reasons. Typically, FMEA involves
the tabulation of all possible faults, or failure modes,
and the systematic evaluation of the effects of each
fault considered one at a time or, in some cases, in
combinations. While the method is conceptually simple
enough, in actual practice there are many difficulties
which arise, especially for a complex digital system
such as ARCS.
The major difficulty concerns the sheer magnitude of
the task. A simple numerical example will quickly
illustrate the point. Digital microcircuits are most
commonly packaged in 14- or 16-lead dual in-line packages
(DIPs). Considering I/O signal pin "stuck-at" faults
typically results in 24 to 28 fault possibilities per DIP,
excluding the power and ground pins. When "logical"
faults are added, this number may range from 24 up to 36
possible faults for the typical 14- or 16-pin DIP.
A digital computer with CPU, I/O section, and memory may
easily contain 500 DIPs. At an average of 30 faults
per DIP, the total number of faults evident on pins alone
is 15,000. When boards, connectors, discrete components,
power supplies, power runs and connections, and other
miscellaneous parts are included, the number of possible
faults in a digital computer is in the vicinity of 15,000
to 20,000 faults.
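The pin-level count quoted above can be reproduced with a few lines of arithmetic. This is only a sketch of the estimate in the text; the variable names are illustrative and the 30 faults per DIP is the average assumed in the paragraph.

```python
# Figures taken from the text; names are illustrative only.
dips = 500             # DIPs in a computer with CPU, I/O section, and memory
faults_per_dip = 30    # assumed average, inside the 24 to 36 range quoted

# Stuck-at fault counts per package: two faults per signal pin,
# excluding the power and ground pins.
stuck_at_14_pin = 2 * (14 - 2)
stuck_at_16_pin = 2 * (16 - 2)

pin_level_faults = dips * faults_per_dip
print(stuck_at_14_pin, stuck_at_16_pin, pin_level_faults)
```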
Not only is the number of faults potentially overwhelming,
but in a programmable digital machine the number of
possible effects ranges from the obvious to the extremely
subtle. Certain faults in the hardware will have several
different effects determined by the resident software,
and the operating state when the fault occurs. For these
reasons FMEA, as a manual paper and pencil method, is
an enormous undertaking when applied to a digital system.
Computer-aided analysis and simulation is useful in some
areas for FMEA, but, because of the need to consider
operating software together with gate level hardware
models, computer aids are quickly limited by computer
execution time. For example, to simulate one second of
real time operation when the gate level model uses a
4 MHz clock requires 4,000,000 simulated clock intervals.
With any complexity at all in the hardware model (say
5000 equivalent gates) it is likely that each simulated
clock interval may require on the order of 25 msec of
simulation running time. For 4,000,000 clock intervals
this amounts to 100,000 seconds, or more than one day!
Obviously, if 15,000 to 20,000 faults are considered one
at a time with such a simulation, the computer running
time involved is absurd.
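The running-time estimate can be checked directly. The sketch below uses only the figures given in the text (4 MHz clock, about 25 msec of simulation time per clock interval, 15,000 to 20,000 faults); the function and variable names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

clock_hz = 4_000_000            # gate level model clock rate
host_seconds_per_tick = 25e-3   # simulation running time per clock interval

# One second of real-time operation for a single inserted fault.
run_seconds_per_fault = clock_hz * host_seconds_per_tick
run_days_per_fault = run_seconds_per_fault / SECONDS_PER_DAY

def total_years(n_faults):
    """Running time, in years, to simulate n_faults one at a time."""
    return n_faults * run_seconds_per_fault / SECONDS_PER_YEAR

print(run_seconds_per_fault, run_days_per_fault)
print(total_years(15_000), total_years(20_000))
```

On these assumptions a single fault costs a little over one day of simulation time, and the full fault list costs several decades, which is the sense in which the text calls the running time absurd.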
Clearly, the use of a conventional FMEA approach for the
purposes of coverage evaluation is - at least for a digital
system - impractical. Consequently, the problem of
coverage evaluation begins to have a rather formidable
aspect to it. The required failure rate data is unavailable,
and even if the data were available, the FMEA task appears
overwhelming. Faced with this state of affairs, a new
approach to the problem is necessary. The following
sections describe GE's efforts in this direction.
Key Assumptions Leading to a Statistical Estimation Approach
Recalling equation (3) and the definitions of the associated
terms, the coverage evaluation problem involves a set of
M faults, with some subset of N "covered" faults which are
to be appropriately identified. Since the failure rate
corresponding to each fault must be considered largely
unknown, it is reasonable to assume that,
The set of "covered faults" constitutes a random
sample of N failure rate values from a total
population of M failure rate values.
In other words, if the failure rate of each fault is
unknown when the failure detection and isolation mechanisms
are designed, it is unlikely that the coverage or non-
coverage of faults will be dependent upon their failure rate.
Therefore, the mean failure rate ($\bar{\lambda}_{ij}$) of the subset of N
faults may be viewed as a sample mean from the failure
rate population of M faults (with mean $\bar{\lambda}_i$). The "Strong
Law of Large Numbers" guarantees that the sample mean
approaches the population mean for large N. Further, the
Central Limit Theorem guarantees that the distribution
of the sample mean approaches a normal distribution
for large N.
The significance of the above assumption, and its
theoretical implications, is that it provides a basis for
considering a simplifying approximation for coverage.
This approximation is motivated by the following observations
concerning $r_{ij}$,
$$ r_{ij} \ge \frac{N}{M} \quad \text{for } \bar{\lambda}_{ij} \ge \bar{\lambda}_i \qquad (4) $$
$$ r_{ij} < \frac{N}{M} \quad \text{for } \bar{\lambda}_{ij} < \bar{\lambda}_i \qquad (5) $$
Equation (4) provides a lower bound on each $r_{ij}$ term which
applies whenever the indicated mean failure rate relationship
is satisfied. Since coverage, $C_i$, is the sum of
selected $r_{ij}$ terms, the lower bound in equation (4) leads
directly to a lower bound for coverage. Thus, for the
simple case where $C_i = r_{ij}$, this relationship holds,
$$ \frac{N}{M} \le C_i \le 1 \quad \text{for } \bar{\lambda}_{ij} \ge \bar{\lambda}_i \qquad (6) $$
Now, it is evident from (6) that a conservative approximation
for coverage may be obtained by evaluating the ratio (N/M),
provided it is known that $\bar{\lambda}_{ij} \ge \bar{\lambda}_i$. This is a very
significant observation because it means that coverage
may be evaluated without the need for failure rate data
for each fault, if only the relative magnitude of two
mean failure rates is known. On an absolute basis,
of course, if the data existed to determine the mean
failure rates then the failure rate of each fault would be
known as well. However, on a probabilistic basis it is
possible to directly evaluate the relationship between
the two mean failure rates using the key assumption
stated above and, in particular, the Central Limit
Theorem. In general, the probability question of interest
is,
$$ P\{ r_{ij} > k\,(N/M) \} = P\{ \bar{\lambda}_{ij} > k\,\bar{\lambda}_i \} = \; ? $$
where 0 < k < 1. Considering $\bar{\lambda}_{ij}$ as the mean failure
rate for a sample of size N, then by the Central Limit
Theorem,
$$ P\{ \bar{\lambda}_{ij} > k\,\bar{\lambda}_i \} \approx P\{ z > -(1-k)\,\alpha \}, \quad \alpha = \sqrt{N}\,\bar{\lambda}_i / \sigma_i, \quad \text{large } N \qquad (7) $$
where z is the standard normal variate, and $\sigma_i$ is the
standard deviation of the failure rate population.
Figures G-6 and G-7 illustrate the numeric consequences
of equation (7). In figure G-6 the important trend to
note is that as the $\alpha$ term increases the probability that
$\bar{\lambda}_{ij} > k\,\bar{\lambda}_i$ approaches one for k values closer to one.
For example, the probability that $\bar{\lambda}_{ij} > .99\,\bar{\lambda}_i$ is .977
for $\alpha$ = 200. Note, however, that regardless of the size
of $\alpha$ all curves go through .5 probability when k = 1.
In figure G-7 it is apparent that as the failure rate
distribution becomes more highly peaked (small variance
relative to the mean) the $\alpha$ term stays in the order of
several hundred for lower values of N. In this regard,
it is critically important to recognize that the sample
size N corresponds to the total number of "covered"
faults (in the $\lambda_{ij}$ category) for the module under evaluation.
Thus, even though the number N is used in the sample
context, it is not unreasonable to suggest that values
of N of the order shown in figure G-7 are possible.
This is especially so when one considers that the number
of faults which are typically tabulated for FMEA purposes
includes many faults lumped into equivalent failure mode
classifications. For example, the I/O signal pin "stuck-at"
faults discussed earlier are actually lumped failure
modes. Any number of specific fault possibilities may
be involved with the occurrence of, say, a "stuck-at-0"
condition on one output pin. In this sense one may argue
that the number of faults N (and necessarily M) is virtually
infinite.
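The Central Limit Theorem argument can be checked numerically. The sketch below assumes the normal approximation takes the form P{z > -(1-k)α} with α = √N times the population mean divided by the population standard deviation; that form is an assumption derived from the theorem as stated in the text, and the function names are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal variate z."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def alpha_term(n, population_mean, population_sigma):
    """Assumed form of the alpha term: sqrt(N) * mean / sigma."""
    return math.sqrt(n) * population_mean / population_sigma

def prob_sample_mean_exceeds(k, alpha):
    """P{sample mean > k * population mean} = P{z > -(1-k)*alpha}."""
    return std_normal_cdf((1.0 - k) * alpha)

# Example quoted in the text: k = .99 and alpha = 200.
print(round(prob_sample_mean_exceeds(0.99, 200.0), 3))
# Every curve passes through .5 at k = 1, whatever the value of alpha.
print(prob_sample_mean_exceeds(1.0, 200.0))
```

Under this assumed form, a population whose standard deviation equals its mean would need N = 40,000 covered faults to reach α = 200, while a more sharply peaked distribution reaches it with far fewer.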
Certainly it is highly desirable to make the type of
approximation for coverage indicated in equation (6)
above. With the added multiplying factor, k, less than
one, the probability is quite high that k (N/M) is a
conservative approximation for each term $r_{ij}$, and hence
coverage. What constitutes a "high probability" is of
course the central issue. A necessary perspective on
this issue is found in the following two considerations:
• (As indicated earlier) basic component failure rate
data is either given with unknown confidence, or,
for established reliability parts, at a standard
60% or 90% confidence.
• Because of the large numbers of faults involved,
practical analysis limitations dictate that even
the ratio (N/M) would be estimated from a "sample"
of faults which is assumed to constitute "all possible"
faults.
In the author's opinion, the first consideration alone
is sufficient to justify the use of the suggested approximation
for coverage. Considering that much of the necessary
failure rate data is unavailable, and that the data that
is available is at best estimated to a 90% confidence,
there seems to be no point whatsoever in attempting to
evaluate coverage with any more than 90% confidence.
The second consideration makes the additional point that
even if the suggested approximation were an absolutely
valid worst case approximation, the evaluation of the
ratio (N/M) would, as a practical matter, still take the
form of an estimate.
For all of these reasons the following additional
assumption is accepted for the purposes of coverage
evaluation:
It is assumed that in all cases $\bar{\lambda}_{ij} \ge \bar{\lambda}_i$ by purposeful
design, or, that N is sufficiently large so that
$\bar{\lambda}_{ij} > .99\,\bar{\lambda}_i$ with probability approaching one.
The purposeful design aspect is always a possibility,
since a designer who had any idea of the relative failure
rates would try to "cover" all of the higher failure
rate faults. With the two assumptions described in
this section the coverage evaluation method will now be
described and the results of a laboratory evaluation
discussed.
The Statistical Estimation Approach
The preceding assumptions set the stage for the evaluation
of coverage in terms of the approximation that,
$$ r_{ij} \approx .99\,(N/M) \qquad (8) $$
This approximation eliminates the failure rate data
problem, but still leaves the difficulty of a potentially
enormous FMEA effort in order to evaluate the ratio (N/M).
Clearly, in those cases where "all possible" faults, M,
can be tabulated, and all of the "covered" faults, N,
easily identified based upon their failure effects,
FMEA is the logical and straightforward approach.
However, in the case of a digital system such as ARCS,
the enormity of the FMEA task, as already described,
suggests that a statistical estimation approach be employed.
The essence of the approach is to use a randomly selected
sample of faults to estimate the ratio (N/M) for each $r_{ij}$
term of interest. The applicable sampling theory is
derived by simply treating the ratio (N/M) as the proportion
parameter of a binomial distribution. The estimation of
a proportion is a standard statistical problem which has
been well treated in the literature (see for example
reference 4).
The necessary theory generally follows the lines that
the cumulative binomial distribution can be adequately
approximated by the cumulative Poisson distribution,
which, in turn, can be evaluated directly in terms of a
chi-squared distribution. The result is that a lower
limit for the estimate of (N/M) is given by,
$$ \text{Lower Limit } (N/M) = 1 - \frac{\chi^2_{1-\alpha}}{2n} $$
where,
n = fault sample size
r = number of "uncovered" faults in sample
$\chi^2_{1-\alpha}$ = chi-squared variable with 2(r+1) degrees of freedom
evaluated at a cumulative probability of $1-\alpha$.
Figure G-8 shows the lower limits obtained for representative
sample sizes and confidence levels.
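The lower limit can be evaluated without statistical tables. The sketch below assumes the limit has the form 1 - χ²/(2n), with χ² taken at 2(r+1) degrees of freedom and at the chosen confidence level. Because the degrees of freedom are always even, the chi-squared distribution function reduces to a finite Poisson sum, which is inverted here by bisection. All function names are hypothetical.

```python
import math

def chi2_cdf_even_dof(x, m):
    """CDF of a chi-squared variable with 2*m degrees of freedom (m >= 1)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, m):          # Poisson terms (x/2)^j / j!
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even_dof(p, m):
    """Value x with chi2_cdf_even_dof(x, m) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_dof(hi, m) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even_dof(mid, m) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coverage_lower_limit(n, r, confidence):
    """Lower limit for (N/M) from n sampled faults with r uncovered."""
    chi2 = chi2_quantile_even_dof(confidence, r + 1)
    return 1.0 - chi2 / (2.0 * n)

# 100 sampled faults, none uncovered, 90% confidence: about .977
print(round(coverage_lower_limit(100, 0, 0.90), 3))
```

Even a sample of 100 faults with no uncovered faults gives a lower limit of only about .977 at 90% confidence, which illustrates why very high coverage values call for large samples.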
In practice the fault sample may be selected several ways,
depending mostly upon the requirements of the particular
activity concerned with coverage evaluation. The sample
faults may be evaluated on paper, with hardware simulation,
or with fault insertion testing. The selection approach
most in line with conventional FMEA is to construct a
table of faults to be considered, and enumerate them.
A random number generator is then used to select the
desired sample. Another approach is to enumerate all
components first, randomly select a sample of components,
then randomly select the fault within each selected
component which is to be used in the final fault sample.
This approach has the advantage of not requiring the
construction of a complete fault table. To keep this
approach from introducing any bias in the results, it
would be necessary to assume for selection purposes that
the relative number of faults per component is proportional
to its total failure rate; or, each component would have
to be weighted for selection according to the actual
number of faults which would have been used in the FMEA
table approach.
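The two selection approaches can be sketched as follows. The component names and fault lists are hypothetical placeholders, and the second function takes the weighting option described in the text, in which each component is weighted by the number of faults it would contribute to the FMEA table. Note that this sketch draws components with replacement, which is a simplification.

```python
import random

# Hypothetical component and fault data, for illustration only.
faults_by_component = {
    "U1": ["pin 1 stuck-at-0", "pin 1 stuck-at-1", "pin 2 stuck-at-0"],
    "U2": ["pin 3 stuck-at-1"],
    "R7": ["open", "short"],
}

def build_fault_table(faults_by_component):
    """Enumerate every (component, fault) pair, as in a conventional FMEA table."""
    return [(c, f) for c, faults in faults_by_component.items() for f in faults]

def sample_from_table(fault_table, n, rng):
    """First approach: draw n distinct faults from the complete fault table."""
    return rng.sample(fault_table, n)

def sample_by_component(faults_by_component, n, rng):
    """Second approach: pick components first, then one fault within each.

    Components are weighted by their fault counts so that every fault has
    the same chance of selection as in the table approach.
    """
    components = list(faults_by_component)
    weights = [len(faults_by_component[c]) for c in components]
    picked = rng.choices(components, weights=weights, k=n)
    return [(c, rng.choice(faults_by_component[c])) for c in picked]

rng = random.Random(1976)  # fixed seed so the selection is repeatable
table_sample = sample_from_table(build_fault_table(faults_by_component), 4, rng)
component_sample = sample_by_component(faults_by_component, 4, rng)
print(table_sample)
print(component_sample)
```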
[Figure G-8. Lower limits for the estimate of (N/M) for representative sample sizes and confidence levels]
Laboratory Evaluation
In order to demonstrate the statistical estimation method
and compare it to the FMEA approach, an ARCS triplex
stage (which was tractable for FMEA) was selected for
laboratory breadboard evaluation of coverage. The stage
was the servo stage including the triplex servo actuator,
servo electronics, and computer output section shown in
figure G-9. A quadruplex 680J force-summed actuator
with one channel bypassed was used for the triplex actuator.
Analog servo electronics (per reference 1) were built
up, and an Intel 8008 microprocessor setup with real-time
I/O capability was used to simulate one ARCS computer channel.
The details of breadboard operation and the coverage evaluation are presented
in the form of the viewgraphs used previously to report these results. Figures
G-11 through G-25 contain these viewgraphs.
As can be seen by the results in figure G-25, a second
failure coverage of approximately .95 at 90% confidence
was obtained, not including the additional .99 factor
in equation (8). (When this factor is added the coverage
is reduced to about .94.) In all cases shown in figure
G-25 the statistical results are more conservative than
the FMEA point estimates.
[Figures G-9 through G-19: breadboard servo stage diagram (figure G-9) and viewgraphs describing breadboard operation and the coverage evaluation]
Inputs:
• Hardware (schematic) and software (code) design
description data
• Experience
• Analysis groundrules
• Assumptions
• Failure mode data
• Failure rate data
• Assumptions
• System operating conditions
• Reliability model state definitions
• Analytical definition for coverage parameter
from reliability model
Steps:
• Identify and define the hardware dependency structure
for the stage of interest.
• Construct a fault table which defines each fault to be
considered within the dependency structure boundaries.
• Assign failure rates to every fault entry in the table.
• Systematically evaluate and categorize each fault
according to its effect under specified circumstances.
• Add up the total failure rates in each failure effect
category and compute the value of the desired coverage
parameter.
Result: A point estimate for coverage, and, not
insignificantly, a comprehensive review of the fault
tolerant design of the stage.
Figure G-20. Outline of the Failure Modes and Effects Analysis (FMEA) Method
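The final step of the FMEA outline can be sketched in a few lines. The sketch assumes that the coverage parameter is the failure-rate-weighted fraction of faults falling in the "covered" effect categories; the fault names, rates, and category labels are hypothetical and are not taken from the ARCS fault table.

```python
# Hypothetical rows: (fault, failure rate per 10^6 hours, effect category).
fault_table = [
    ("U1 pin 1 stuck-at-0", 2.0, "detected"),
    ("U1 pin 1 stuck-at-1", 1.0, "undetected"),
    ("R7 open", 1.0, "detected"),
]

def fmea_point_estimate(fault_table, covered=("detected",)):
    """Covered failure rate divided by total failure rate."""
    total = sum(rate for _, rate, _ in fault_table)
    covered_rate = sum(rate for _, rate, cat in fault_table if cat in covered)
    return covered_rate / total

def count_ratio(fault_table, covered=("detected",)):
    """The ratio (N/M): covered fault count over total fault count."""
    n = sum(1 for _, _, cat in fault_table if cat in covered)
    return n / len(fault_table)

coverage = fmea_point_estimate(fault_table)
ratio = count_ratio(fault_table)
print(coverage, ratio)
```

In this made-up table the covered faults have a higher mean failure rate than the population as a whole, so the count ratio (N/M) falls below the rate-weighted coverage, in line with the conservative bound discussed earlier.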
Inputs:
• Hardware (schematic) and software (code) design
description data
• Experience
• Analysis groundrules
• Assumptions
• Failure mode data
• Random number generator
• System operating conditions
• Reliability model state definitions
• Analytical definition for coverage parameter
from reliability model
• Statistical model for sample data
Steps:
• Identify and define the hardware dependency structure
for the stage of interest.
• Construct a fault table which defines each fault to be
considered within the dependency structure boundaries.
• Enumerate the faults and select a random sample of
appropriate size.
• Manually insert each sample fault into the hardware and
categorize it according to its effect under specified
circumstances.
• Using the sample results compute a lower limit for the
desired coverage parameter at a selected confidence level.
Result: An interval estimate for coverage with a
selected confidence, and a test sample evaluation
of actual hardware performance.
Figure G-22. Outline of Statistical Estimation Method
[Figures G-23 through G-25: viewgraphs of the FMEA and statistical estimation coverage results]
Conclusions
A practical method for evaluating fault coverage has
been developed and demonstrated. Estimation of fault
coverage at a 90% confidence level is consistent with the
best failure rate data available today. Estimation of
very high coverages (greater than .97) is not easily
done with a high confidence, without a very large sample
size. Because of the .99 multiplying factor used in
the development, coverages estimated using the described
method must necessarily be less than .99.
References
1. "ARCS Hardware Description", ACS 10936, General
Electric Company, Binghamton, New York, December 1975.
2. "Microcircuit Device Malfunction Report", MDMR-0275,
Reliability Analysis Center, RADC, Griffiss AFB,
Rome, New York, March 1975.
3. "Reliability Engineering", ARINC Research Corporation,
W. H. Von Alven - Editor, Prentice-Hall, Englewood
Cliffs, New Jersey, 1964.
4. A. Hald, "Statistical Theory with Engineering Applications",
John Wiley & Sons, Inc., New York, 1952.
APPENDIX H
RELIABILITY RESULTS DETAILS/PARAMETERS
1.0 SCOPE
This appendix gives an account of the more detailed analysis performed
to establish the various Markov model parameters. Sources for these
parameters are identified in Table H-1.
2.0 PERMANENT FAILURE RATES
Permanent failure rates for the WWCS and ARCS computers were predicted
by General Electric based on MIL-HDBK-217B. To make a comparison
between the WWCS and the ARCS meaningful, the failure rates (previously
submitted by GE) for the WWCS, which were based on a different source,
had to be adjusted.
Permanent failure rates for sensors and servos were based on airline main-
tenance data accumulated by Boeing.
Additional details may be found in Table H-2.
3.0 TRANSIENT FAILURE RATES AND LEAKAGES
Four different transient fault effects were considered in the study:
1) Electrical Power Transients
2) Lightning
3) Hydraulic Pressure Transients
4) Sensor Nuisance Failures
These different sources are discussed below with the emphasis on Sensor
Nuisance Failures, which in the past have been the dominating nuisance failure
source.
3.1 ELECTRICAL POWER SYSTEM TRANSIENTS
Electrical power system transients may be classified as being either normal
or abnormal. Normal transients are transients which can be expected during
normal operation of the airplane with the equipment operating normally.
Abnormal transients are those that do not necessarily occur in the normal operation
of the airplane. They are usually the result of equipment malfunction
or failure or the use of abnormal operating procedures.
A few of the commonly encountered normal transient sources are listed
in Table H-3. Table H-4 displays a few abnormal transient sources.
The transient data presented shows that normal transient faults will have a
duration in the order of 50 ms. The rate of occurrence is estimated not to
exceed 20 times per hour. With this high occurrence rate, it will
become necessary to design the hardware, for example DC power supplies,
such that all normal transients will be absorbed without degradation
of the system operation. This approach, which agrees with common avionic
system requirements, implies that the leakages associated with normal power
transients are zero; only a design error would result in a nonzero leakage.
The most critical abnormal transients are short circuits caused for example
by a failed module. Depending on the isolation mechanism, i. e. whether
electronic or thermal, it may take several seconds to clear these faults —
a period during which the power supply voltage could drop essentially to
zero. It is estimated that these abnormal transients will not occur more
often than once per one hundred hours, i.e. a fault rate equal to $10^{-2}$/hour.
It will be assumed for the ARCS, that each channel is powered from a
separate power bus where a channel includes the sensor, computer and servo
stages. It will further be assumed that each sensor signal will always
recover following a power loss, recognizing that this recovery may include
warmup and filter settling time. A servo stage will also be intact following
a power loss but will always recover in the servo off state.
Regarding the computer stage, reliance will be put on the ARCS ability to auto-
matically start up and resynchronize following a power loss. This will also
apply to the remote situation of a simultaneous power loss in two channels.
(This event has sufficiently low probability that the ARCS system would be
acceptable even without this capability. It is, however, "for free" since
it is required as a part of the ARCS initial automatic start up function.)
If the restart function is properly designed, the system would, at least in
principle, be able to recover from all possible transient power loss events.
In the ARCS trade and reliability study it is therefore assumed that the
computer transient power fault leakage is zero (nominally).
3.2 LIGHTNING STROKES
There are generally two separate time phases of a lightning stroke (figure H-l).
One short initial phase with a duration in the order of a few tens of micro-
seconds during which the current may reach a value of $2 \cdot 10^5$ A, and a
second phase with a duration up to a second and current levels in the order
of a few hundred amperes. This latter phase is the more critical from a
structural damage viewpoint, while the short initial phase may give rise to
potentially harmful electromagnetic interference. Experience and tests
show that lightning induced power bus transients usually will be of a
magnitude not exceeding 600 volts peak-to-peak and a duration in the order
of 50 µs (figure H-2). Lightning induced transients, therefore, have an
effect similar to transients caused by inductive switching, with the
difference that they may occur anywhere in the system, in particular in
longer wire connections, as for example to the sensors.
The effect of these transients on analog hardware may be suppressed
by proper design methodology. The effect on digital hardware, for example
whether an induced transient could scramble the memory, is more difficult
to assess, but experience indicates that this possibility is remote. Aircraft
equipped with digital Inertial Navigation Systems have, for example, on
several occasions been subjected to lightning strokes without any noticeable
effect on digital system operation.
Since it has not been shown that lightning has a significant
effect on digital system operation, the transient leakage due to lightning will,
nominally, be set equal to zero in the ARCS reliability analysis.
[Figures H-1 (lightning stroke time phases) and H-2 (lightning induced power bus transient): not legible in the available copy]
3.3 HYDRAULIC POWER TRANSIENTS
The most common hydraulic pressure transients are shock waves caused by
sudden valve closure and pressure droops caused by a sudden massive demand.
However, these effects should, according to Boeing expertise, be manage-
able by proper worst case design so that hydraulic power system pressure
variations will always stay within ±50% of the nominal line pressure. Thus,
if high and low limit pressure switches were set at those values, transient
free operation would result.
For the ARCS trade study and reliability evaluation, it is therefore assumed
that transient hydraulic power faults nominally have zero rates.
3.4 SENSOR NUISANCE FAULTS
Sensor nuisance faults are usually caused by temporary disagreements between
redundant sensor signals. These sensor disagreements could be caused
by "static" effects like bias or scalefactor differences or they could be
dynamic in nature, stemming from different sensor dynamics or differing
inputs. Examples of dynamic disagreements caused by deviating inputs are
accelerometers sensing different accelerations due to mounting structure
vibrations or air data signals sensing the air pressure in different pitot tubes.
Static deviations and dynamic deviations caused by unmatched dynamic sensor
responses may, at least in theory, be eliminated by improved sensor design
or compensation algorithms. One such algorithm developed by Boeing is
the Compensated Limited Average (CLA) algorithm (see reference H-1)
which compensates for bias deviations between analog sensor signals. Other,
more sophisticated algorithms, which perform bias as well as scalefactor
compensation, have been designed and presently are under evaluation by the
Flight Controls Technology Staff at Boeing.
By subtracting the static effects from the sensor difference signal, a
compensated (or equalized) signal results, which more closely represents
the dynamic part of the signal deviation. This dynamic error signal may be
considered random, exhibiting certain statistical properties like variance
and spectrum. The random signal is monitored by a dynamic failure detection
mechanism which ideally should be able to distinguish between sensor dis-
agreements due to randomness and "real" sensor failures. The detection
mechanism used by ARCS consists of two fixed thresholds, one positive and
one negative. A fault is detected whenever a difference signal exceeds the
preset thresholds.
3.4.1 Faults at Triplex or Higher Redundancy
The ARCS strategy for sensor faults occurring at triplex (quadruplex) redun-
dancy level, depicted in figure H-3, consists of reconfiguring from triplex
(quad) to duplex (triplex) immediately upon a threshold exceedance. The
deviating signal will not be used unless it recovers at some later time
within the threshold band and remains within this band for a prescribed time
period $T_1$.
The recovery of a previously faulty signal may take place at any time after
the occurrence of the fault until a time when a second like fault occurs. A
second fault in a like sensor will immediately cause a first fault to be declared
permanent. As a consequence, permanent hardover failures will not be
classified as such until a second fault occurs.
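The reconfiguration and recovery rules described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name SignalMonitor, the function step, the sampled-data form with a fixed time step dt, and the symmetric threshold are hypothetical and do not come from the ARCS design documentation.

```python
from dataclasses import dataclass


@dataclass
class SignalMonitor:
    """Hypothetical monitor for one redundant signal (illustration only)."""
    threshold: float        # fixed +/- limit on the difference signal
    recovery_time: float    # T1: seconds the signal must stay in band to recover
    faulted: bool = False
    permanent: bool = False
    time_in_band: float = 0.0

    def update(self, difference: float, dt: float) -> bool:
        """Process one sample; return True if a new fault was declared."""
        if self.permanent:
            return False
        if abs(difference) > self.threshold:
            new_fault = not self.faulted
            self.faulted = True        # reconfigure immediately on exceedance
            self.time_in_band = 0.0    # restart the recovery clock
            return new_fault
        if self.faulted:
            self.time_in_band += dt
            if self.time_in_band >= self.recovery_time:
                self.faulted = False   # recovered: signal may be used again
        return False


def step(monitors, differences, dt):
    """Advance all like-sensor monitors by one sample; return usable flags."""
    already_faulted = [m for m in monitors if m.faulted and not m.permanent]
    for monitor, difference in zip(monitors, differences):
        if monitor.update(difference, dt):
            # A second fault in a like sensor makes an outstanding first
            # fault permanent.
            for other in already_faulted:
                if other is not monitor and other.faulted:
                    other.permanent = True
    return [not m.faulted for m in monitors]
```

In this sketch a hardover signal simply never re-enters the band, so it stays excluded without being labeled permanent until a second fault occurs, which mirrors the behavior noted in the text.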
The strategy described will exhibit outstanding nuisance fault rejection
capability. However, there are two situations which have to be analyzed.
The first is the possibility that a second fault occurs before the first fault
has time to recover, i.e., before the end of the $T_1$ period. This would lead
to the first fault being declared permanent. A transient leakage would then
take place in case the first fault was caused by a transient condition.
[Figure H-3, ARCS sensor fault strategy at triplex or higher redundancy: not legible in the available copy]
The second situation to be analyzed is the possibility and consequence of a
latent failure caused by a false recovery. A signal, for example a rate
gyro output, failing to zero could falsely recover at some later time
when the good signals also are close to zero. A subsequent
failure to zero in a redundant rate gyro could cause an undetected system
failure. Both considerations above will influence the selection of $T_1$, the
first in the direction of a shorter and the second in the direction of a longer
$T_1$ period.
The first fault leakage, $\ell_1$, will be given by the probability of a second fault
occurring in the $T_1$ interval and may be expressed by:
$$\ell_1 = N \, (\lambda_p + \lambda_T) \, T_1$$
with N = 2 or N = 3 for a triplex and a quad stage respectively and with $\lambda_p$
the permanent and $\lambda_T$ the transient fault rate.
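As a rough numerical illustration of the first fault leakage, the sketch below evaluates N times the sum of the permanent and transient fault rates times the recovery period. The function name, the conversion of the recovery period from seconds to hours, and the sample rates are assumptions made for the example, not values from the study.

```python
def first_fault_leakage(n_remaining, lambda_p, lambda_t, t1_seconds):
    """Leakage = N * (lambda_p + lambda_T) * T1.

    Assumes fault rates are given per hour and T1 in seconds, so T1 is
    converted to hours before multiplying (a units assumption).
    """
    t1_hours = t1_seconds / 3600.0
    return n_remaining * (lambda_p + lambda_t) * t1_hours


# Hypothetical rates: 1e-4 permanent and 1e-2 transient faults per hour,
# with a 10 second recovery period.
triplex = first_fault_leakage(2, 1e-4, 1e-2, 10.0)   # N = 2 for a triplex stage
quad = first_fault_leakage(3, 1e-4, 1e-2, 10.0)      # N = 3 for a quad stage
print(triplex, quad)
```

Since the leakage is linear in the recovery period, halving the period halves the leakage, which is the pressure toward a shorter period mentioned above.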
The situation of latent failures and the possibility of false recoveries, i. e.,
recovery of a failed signal, will now be addressed.
A latent failure occurs when a signal fails passively in a way such that it is
undetected by available signal monitors. Usually a latent failure will be
detected at some later time. A sensor signal failing passively will for
example be detected when the good signals are caused to deviate significantly
from the passively failed signal. On the other hand, as discussed above, a
detected passive failure may at some later time falsely recover. These two
situations may be modelled by the Markov representation of figure H-4.
State 3 of the figure is partitioned into two states, 3a and 3b.
State 1 in the diagram is the unfailed state, state 2 the latent failure state,
state 3a the detected but not isolated passive failure state and state 3b the
detected and isolated failure state. $f_2$ and $f_1$ are fractions representing
the conditional probability of an undetected failure and the conditional prob-
ability of a detected but not isolated failure, given a failure.
[Figure H-4, Markov representation of latent failure and false recovery: not legible in the available copy; one surviving transition label reads 3λ(1 - f1 - f2)]
$\lambda_{23}$ is the rate of detecting a latent failure and $\lambda_{32}$ the rate of false
recovery of a passively failed signal. $\lambda_{23}$ may be assessed by esti-
mating the threshold exceedance rate of the difference between an active
signal and a passively failed signal. The rate $\lambda_{32}$ may be assessed
as follows:
Let $\lambda = \lambda_{23}$ be the latent failure detection rate and assume quiescent
flight conditions with occasional exceedances as depicted in figure H-5.
Let the time between two exceedances be $t_1$. A passively failed signal
will falsely recover if $t_1 > T_1$, where $T_1$ is the SSFD recovery time of
figure H-3.
[Figure H-5: threshold exceedances of the difference between an active signal and a passively failed signal]
FIGURE H-5 LATENT FAILURE SITUATION
The probability of no threshold exceedance in the time interval $[0, T_1]$,
i.e., the probability of a latent recovery is, assuming a Poisson
distribution:
$$p(\text{latent recovery}) = \exp(-\lambda T_1)$$
Furthermore, the probability of exactly n exceedances in a time interval
$[0, t]$ is given by:
$$p[\text{number of exceedances equal to } n] = \frac{(\lambda t)^n}{n!} \exp(-\lambda t)$$
The probability that a latent signal will not falsely recover in the time interval
$[0, t]$ is then given by:
$$p(\text{no false recovery}) = \sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!} \exp(-\lambda t) \, [1 - \exp(-\lambda T_1)]^n$$
$$= \exp[\lambda (1 - \exp(-\lambda T_1)) \, t] \cdot \exp(-\lambda t)$$
$$= \exp(-\lambda \exp(-\lambda T_1) \, t)$$
This probability is the solution of
$$\dot{p} = -\lambda \exp(-\lambda T_1) \, p ; \quad p(0) = 1$$
which means that
$$\lambda_{32} = \lambda \exp(-\lambda T_1) = \lambda_{23} \exp(-\lambda_{23} T_1)$$
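The step from the Poisson series to the closed-form exponential can be checked numerically. The sketch below sums the series term by term and compares it with the exponential decay at the false recovery rate; the function names and the sample values of the exceedance rate, recovery period and observation time are hypothetical.

```python
import math


def false_recovery_rate(lam, t1):
    """Rate of false recovery: lam * exp(-lam * T1)."""
    return lam * math.exp(-lam * t1)


def prob_no_false_recovery_series(lam, t1, t, terms=200):
    """Sum over n of Poisson(n; lam*t) * (1 - exp(-lam*T1))**n."""
    q = 1.0 - math.exp(-lam * t1)
    total = 0.0
    term = 1.0                        # (lam * t * q)**n / n! for n = 0
    for n in range(terms):
        total += term
        term *= lam * t * q / (n + 1)
    return math.exp(-lam * t) * total


lam, t1, t = 0.5, 2.0, 10.0           # hypothetical values, consistent time units
series = prob_no_false_recovery_series(lam, t1, t)
closed = math.exp(-false_recovery_rate(lam, t1) * t)
print(series, closed)                 # the two agree to rounding error
```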
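Since the CARSRA program will only handle unidirectional transitions, the
failure rates $\lambda_{23}$ and $\lambda_{32}$ have to be condensed into an equivalent rate $\lambda_{23}^*$.
This can be done as follows: the probability mass flowing from state 3 to 2 in
a small time element dt is
$$dp_{32} = \lambda_{32} \, p_{3a} \, dt = \lambda e^{-\lambda T_1} \cdot 3 \lambda_p f_1 t \, dt$$
(It can be shown that $p_{3a} \approx 3 \lambda_p f_1 t$ and $p_2 \approx 3 \lambda_p f_2 t$, where $\lambda_p$ is the module failure rate.)
The incremental mass flowing from state 2 to 3 is
$$dp_{23} = \lambda_{23} \, p_2 \, dt = \lambda \cdot 3 \lambda_p f_2 t \, dt$$
so that
$$dp_{23} - dp_{32} = \lambda f_2 \cdot 3 \lambda_p t \, dt - \lambda e^{-\lambda T_1} \cdot 3 \lambda_p f_1 t \, dt$$
$$= (\lambda - \lambda e^{-\lambda T_1} (f_1 / f_2)) \cdot 3 \lambda_p f_2 t \, dt$$
$$= \lambda (1 - e^{-\lambda T_1} (f_1 / f_2)) \, p_2 \, dt$$
The equivalent rate $\lambda_{23}^*$ then becomes
$$\lambda_{23}^* = \lambda (1 - e^{-\lambda T_1} (f_1 / f_2))$$
which could be positive or negative. In the case of a negative $\lambda_{23}^*$, the
appropriate model will be using an equivalent rate $\lambda_{32}^*$ derived as follows:
$$dp_{32} - dp_{23} = \lambda (e^{-\lambda T_1} f_1 - f_2) \cdot 3 \lambda_p t \, dt = \lambda_{32}^* \, p_3 \, dt, \quad p_3 \approx 3 \lambda_p (1 - f_2) t$$
$$\lambda_{32}^* = \lambda (e^{-\lambda T_1} f_1 - f_2)(1 - f_2)^{-1}$$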
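A minimal sketch of how the two bidirectional rates might be condensed into a single unidirectional rate follows. It assumes the reading that the net rate from state 2 to state 3 is lam * (1 - exp(-lam*T1) * f1/f2) and that, when this is negative, the reverse rate lam * (exp(-lam*T1) * f1 - f2) / (1 - f2) applies. The function name and the convention of returning a pair with one zero entry are hypothetical.

```python
import math


def equivalent_rates(lam, t1, f1, f2):
    """Return (rate_2_to_3, rate_3_to_2); exactly one entry is nonzero
    unless the two flows balance. Requires 0 < f2 < 1."""
    decay = math.exp(-lam * t1)
    rate_23 = lam * (1.0 - decay * f1 / f2)
    if rate_23 >= 0.0:
        return rate_23, 0.0
    rate_32 = lam * (decay * f1 - f2) / (1.0 - f2)
    return 0.0, rate_32


# Hypothetical fractions: with a long recovery period false recoveries are
# rare and the net flow is from the latent state toward the detected state.
print(equivalent_rates(1.0, 5.0, 0.1, 0.2))
# With a very short recovery period and f1 > f2 the net flow reverses.
print(equivalent_rates(1.0, 0.0, 0.2, 0.1))
```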
3.4.2 Faults at Duplex Redundancy
The ARCS strategy for handling sensor transient faults
when operating in duplex will be different from the strategy
employed when operating at a triplex or quadruplex redundancy level. Upon
a disagreement between two remaining operational sensor signals, the selected
signal will be held, or frozen, at the value it had at the point of the disagree-
ment (see figure H-6). This applies to failures occurring in any of the
different stages in the ARCS. The frozen condition will
impair the control of the aircraft to an extent depending on the
flight mode and the location and the duration of the fault. The automatic
landing mode will for example be more critical than a cruise mode (CWS),
and servo command faults more critical than most sensor faults. The maximum
permissible duration, before the failure condition must be resolved, will be
determined by several different factors including the particular control law
implementation and the dynamic characteristics of the controlled aircraft,
and will eventually, in a practical implementation, have to be determined by
closed loop simulation. However, performing a closed loop simulation is a
major task which falls outside the scope of the ARCS program. The ARCS
reliability and trade study therefore used already available data from a
study performed on the 747 automatic landing system (reference H-2).
In this study, which was performed to support the certification of the 747
autoland system, the aircraft trajectory responses to a number of fault
conditions were studied using a closed loop simulation. A failure
annunciation delay of less than or equal to one second was found to be acceptable
for all faults considered. One second will therefore conservatively be
selected as being the maximum recovery time for ARCS faults occurring at a
duplex stage redundancy level, regardless of the location of this fault.
[Figure H-6, duplex fault strategy with the selected signal frozen upon disagreement: not legible in the available copy]
Upon a disagreement between two remaining signals, recovery will thus be
attempted up to one second after the occurrence of the fault. If recovery
is not successful in this time period, a passive system failure will be announced
for the near and intermediate application models. A possible strategy for
the far term Fly-By-Wire application could be to randomly select one of the
remaining signals. In the analysis of this system it was however assumed
that the system fails upon an unresolved disagreement between two remaining
signals.
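The duplex strategy described above (freeze the selected signal on disagreement, attempt recovery for up to one second, then announce a passive failure) might be sketched as follows. The function name, the use of the average as the selected signal, and the sampled-data form are assumptions for illustration only.

```python
def duplex_monitor(samples, threshold, dt, max_recovery=1.0):
    """Process (a, b) sample pairs from two remaining signals.

    Returns a list of (output, status) tuples where status is "ok",
    "frozen" or "failed". The selected signal is assumed to be the
    average of the two inputs (an illustrative choice).
    """
    output = None
    frozen_for = 0.0
    results = []
    for a, b in samples:
        if abs(a - b) <= threshold:
            output = 0.5 * (a + b)      # signals agree: update the output
            frozen_for = 0.0
            results.append((output, "ok"))
            continue
        frozen_for += dt                # disagreement: hold the last output
        if frozen_for > max_recovery:
            results.append((output, "failed"))   # passive system failure
            break
        results.append((output, "frozen"))
    return results


recovered = duplex_monitor([(0.0, 0.0), (0.0, 5.0), (0.0, 5.0), (2.0, 2.0)],
                           threshold=1.0, dt=0.5)
failed = duplex_monitor([(0.0, 0.0), (0.0, 5.0), (0.0, 5.0), (0.0, 5.0)],
                        threshold=1.0, dt=0.5)
print(recovered[-1], failed[-1])
```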
3.5 ANALYTICAL TOOLS FOR SENSOR NUISANCE FAILURE DETECTION
In order to be able to predict reliability model parameters
like transient fault rates and leakages, analytical expressions are needed for
threshold exceedance rates as well as the probability distribution of the threshold
exceedance duration. The following formula for the positive or negative threshold
exceedance rate in exceedances per second, assuming a Gaussian process, is
given in reference H-3 .
$$\text{Threshold Exceedance Rate} = \text{TER} = \frac{1}{2\pi} \left( \frac{\lambda_2}{\lambda_0} \right)^{1/2} e^{-u^2/2} \quad \text{per second}$$
where $\lambda_0 = r(0)$; $\lambda_2 = -\left[ \frac{d^2 r}{dt^2} \right]_{t=0}$;
$u$ = normalized threshold = threshold / std. dev.,
and r(t) is the autocorrelation function. In reference H-4 this formula
has been generalized to non-Gaussian processes under the assumption of
independency between the amplitude and the derivative of the amplitude.
The generalized expression is:
[The generalized expression is not legible in the available copy.]
where $p(\cdot)$ is the amplitude probability distribution and ZPS is the zero
crossing rate, i.e., mean time between zero crossings.
In the case of a second order process, it may be shown that
$$\frac{1}{2\pi} \left( \frac{\lambda_2}{\lambda_0} \right)^{1/2} = B = \text{bandwidth in cps,}$$
which suggests the formula:
$$\text{TER} = \frac{p(u)}{p(0)} \cdot B$$
This expression will be used in the following. In reference H-3 it is also
shown that the average time spent above a certain threshold, u, is given by:
$$\text{Average time above } u: \quad T_A = \frac{1 - P(u)}{\text{TER}} = \frac{P(-u)}{\text{TER}}$$
where $P(u) = \int_{-\infty}^{u} p(x) \, dx$.
This expression holds regardless of the symmetric amplitude density $p(\cdot)$.
A useful approximation when p(u) is the Gaussian distribution is:
$$T_A = \frac{u}{B \, (1 + u^2) \sqrt{2\pi}}$$
which holds for $|u| > 2$.
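For the Gaussian case, the threshold exceedance rate and the average time above the threshold can be evaluated directly, and the large-threshold approximation compared with the exact tail probability. The sketch below assumes that the amplitude density ratio p(u)/p(0) equals exp(-u*u/2) for a unit-variance Gaussian; the function names are hypothetical.

```python
import math


def threshold_exceedance_rate(u, bandwidth_cps):
    """TER = B * p(u)/p(0); for a Gaussian density this is B * exp(-u^2/2)."""
    return bandwidth_cps * math.exp(-0.5 * u * u)


def gaussian_tail(u):
    """P(-u) = 1 - P(u) for a unit-variance Gaussian amplitude."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def average_time_above(u, bandwidth_cps):
    """T_A = P(-u) / TER, using the exact Gaussian tail."""
    return gaussian_tail(u) / threshold_exceedance_rate(u, bandwidth_cps)


def average_time_above_approx(u, bandwidth_cps):
    """Approximation T_A = u / (B * (1 + u^2) * sqrt(2*pi))."""
    return u / (bandwidth_cps * (1.0 + u * u) * math.sqrt(2.0 * math.pi))


for u in (3.0, 4.0, 5.0):
    exact = average_time_above(u, 1.0)
    approx = average_time_above_approx(u, 1.0)
    print(u, threshold_exceedance_rate(u, 1.0), exact, approx)
```

The agreement improves as the normalized threshold grows, in line with the stated restriction to thresholds above two standard deviations.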
An exact expression for the probability distribution of the time spent over a
threshold appears to be difficult to derive. An approximate expression
which holds for large thresholds assuming a Gaussian process is however
given in reference H-3:
$$\text{Prob}(\text{exceedance time} > t) \approx \exp\left( -\frac{\pi}{4} \left( \frac{t}{T_A} \right)^2 \right)$$
In the development below it will be assumed that this formula is good enough
to permit the assessment of leakages at duplex redundancy.
A more compact formula for the nuisance failure rate at duplex redundancy
may be obtained by observing that a nuisance failure results when (a) the
threshold is exceeded and (b) the signal stays above the threshold too long.
This may be expressed
$$\text{NFR} = \text{Nuisance Failure Rate} = 4 \times \text{TER} \times \exp\left( -\frac{\pi}{4} \left( \frac{T_2}{T_A} \right)^2 \right)$$
where $T_2$ is the maximum permissible recovery time in duplex. Substituting
the expression for $T_A$:
$$\text{NFR} = 4 \times \text{TER} \times \exp\left( -\frac{\pi}{4} \left[ \frac{\text{TER} \cdot T_2}{P(-u)} \right]^2 \right)$$
This expression may be maximized with respect to TER so that:
$$\text{NFR} \le \text{NFR}_{max} = 4 \times \sqrt{\frac{2}{\pi e}} \cdot \frac{P(-u)}{T_2}$$
This simple expression may be used to conservatively estimate the nuisance
failure rate at duplex redundancy. Observe that the expression does not
depend on the bandwidth, B, of the signal!
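The bandwidth-independent bound can be illustrated numerically. The sketch below assumes the nuisance failure rate takes the form 4 * TER * exp(-(pi/4) * (T2/TA)^2) with TA = P(-u)/TER, and that maximizing over TER gives 4 * sqrt(2/(pi*e)) * P(-u)/T2; the function names and sample numbers are hypothetical.

```python
import math


def nuisance_failure_rate(ter, tail_prob, t2):
    """NFR = 4 * TER * exp(-(pi/4) * (T2 / TA)^2), with TA = P(-u) / TER."""
    t_a = tail_prob / ter
    return 4.0 * ter * math.exp(-(math.pi / 4.0) * (t2 / t_a) ** 2)


def nuisance_failure_rate_bound(tail_prob, t2):
    """Maximum of NFR over TER: 4 * sqrt(2 / (pi * e)) * P(-u) / T2."""
    return 4.0 * math.sqrt(2.0 / (math.pi * math.e)) * tail_prob / t2


tail_prob, t2 = 1e-3, 1.0             # hypothetical P(-u) and recovery time
bound = nuisance_failure_rate_bound(tail_prob, t2)
# The maximum is attained at TER = sqrt(2/pi) * P(-u) / T2.
ter_star = math.sqrt(2.0 / math.pi) * tail_prob / t2
print(bound, nuisance_failure_rate(ter_star, tail_prob, t2))
```

Because the threshold exceedance rate is proportional to the bandwidth, scanning over it is equivalent to scanning over bandwidth, which is why the bound holds for any bandwidth.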
3.6 AVAILABLE SENSOR STATISTICS
Sensor signal statistics have been recorded in several flights with an experi-
mental 737 aircraft (the NASA 515 test vehicle). The recorded data, which
consists of raw digitized signals from triplicated flight control sensors, has
been processed by Boeing to extract information like mean values, variances,
power spectra, histograms, etc. for signal amplitudes and/or signal differ-
ences. This statistical data base was used in the ARCS Reliability/Trade
Study to assess sensor transient fault rates and leakages.
Table H-5 shows standard deviations for a number of sensor signal differ-
ences, i. e., the difference between two like signals. The data is presented
for both quiescent conditions and rough turbulent conditions.
For this study the assumption will be made that the dynamic sensor deviation
signals are Gaussian distributed with a process bandwidth of 1 cps. The
Gaussian assumption is adopted since insufficient information presently is
available about the detailed structures of the tails of the density functions of
interest. The 1 cps bandwidth assumption is consistent with available power
spectral density plots for rough turbulent conditions.
3.7 ICPS SENSOR TRANSIENT FAULT RATE AND LEAKAGE ANALYSIS
The Incremental Control Processor System (ICPS), currently installed
on the NASA 515 experimental aircraft has, in the past, been plagued by
sensor nuisance failures. To test the validity of the developed analysis
tools, the nuisance failure rates for this system were predicted assuming
the SSFD algorithm documented in reference H-5.
Random fault rates will first be addressed under the assumption that deter-
ministic errors caused by bias and scalefactor influences have been removed
by proper compensation. The effects of deterministic errors will be addressed
separately later.
The ICPS SSFD design consists of two cascaded first order filters followed
by threshold detectors and timeout counters (figure H-7). Inputs to
these filters are the signal differences between individual sensor signals,
i.e., A-B, A-C, and B-C.
TABLE H-5 FLIGHT CONTROL SENSOR DIFFERENCE STANDARD DEVIATIONS
SENSOR SIGNAL    STD. DEV., QUIESCENT    STD. DEV., ROUGH
ḧ                0.035 FPS2              0.16 FPS2
ḣ                1.5 FPS                 6.5 FPS
θ                0.05 DEG                0.24 DEG
θ̇                0.01 DEG/S              0.02 DEG/S
φ                0.35 DEG                (not legible)
φ̇                0.02 DEG/S              0.08 DEG/S
R/A              NA                      4.1 FT
GSE              NA                      0.004 FT
LOC              NA                      0.02 DEG
COL              NA                      0.11 LB
WHL              NA                      0.07 LB
[Figure H-7: cascaded first order filters (block label TS+1), each followed by a threshold comparator and a timeout counter]
FIGURE H-7 ICPS SSFD ALGORITHM
Table H-6 shows the threshold values and delays initially selected for the
ICPS which are documented in reference H-5. The numbers of delay
counts have been translated to seconds using 6.144 ms per count.
To be able to use the formulas for threshold exceedance rates and average
time above a threshold, which were derived in section 3.5, the standard
deviations and the bandwidths of the filtered difference signals have to be
estimated. Standard deviations may be assessed by using the formulas:
$$\sigma_o^2 = \int_0^{\infty} |G(\omega)|^2 \, |x(\omega)|^2 \, d\omega$$
$$\sigma_i^2 = \int_0^{\infty} |x(\omega)|^2 \, d\omega$$
where $\sigma_o^2$ is the variance of the filter output, $\sigma_i^2$ the variance of the
filter input, $x(\omega)$ the Fourier transform of the input signal and $G(\omega)$ the
filter transfer function with $s = j\omega$. For simplicity it will be assumed
that the input spectrum, $|x(\omega)|^2$, is flat in an interval $[0, 2\pi f]$ with
f = 1 cps and vanishes for higher frequencies.
With this assumption, $\sigma_i^2$ = variance in Table H-5 = $2\pi \cdot |x(\omega)|^2$
so that:
$$|x(\omega)|^2 = \frac{\sigma_i^2}{2\pi}, \quad 0 < \omega < 2\pi$$
$$|x(\omega)|^2 = 0, \quad \omega > 2\pi$$
Furthermore for the lag filter:
$$G(\omega) = \frac{1}{j \omega T_1 + 1}$$
$$\sigma_o^2 = \int_0^{\infty} \frac{1}{\omega^2 T_1^2 + 1} \cdot \frac{\sigma_i^2}{2\pi} \, d\omega = \frac{\sigma_i^2}{2\pi T_1} \int_0^{\infty} \frac{1}{x^2 + 1} \, dx = \frac{\sigma_i^2}{2\pi T_1} (\text{ATAN}(\infty) - \text{ATAN}(0))$$
$$\sigma_o = \frac{\sigma_i}{2 \sqrt{T_1}}$$
where the assumption has been made that $T_1 \gg 0.25$, so that the contribution
to the integral for frequencies above 1 cps is negligible.
For the lag - washout combination we get:
[The transfer function, the variance integral and the resulting expression are not legible in the available copy.]
The bandwidth of the filtered signal is for the lag filter approximated by:
[expression not legible in the available copy]
For the lag - washout combination:
[expression not legible in the available copy; surviving fragments read "BW", "1", "4T", "4T", "cps"]
[Table H-6, ICPS SSFD threshold values and delays: not legible in the available copy]
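The lag filter result can be checked by integrating the flat, band-limited input spectrum numerically. The sketch below assumes the reading that the filter output standard deviation is the input standard deviation divided by twice the square root of the lag time constant, with the input spectrum flat up to 1 cps; the function names and the sample time constant are hypothetical.

```python
import math


def lag_output_std_closed_form(sigma_in, t1):
    """sigma_o = sigma_i / (2 * sqrt(T1)), valid for T1 well above 0.25 sec."""
    return sigma_in / (2.0 * math.sqrt(t1))


def lag_output_std_numeric(sigma_in, t1, f_max_cps=1.0, steps=20000):
    """Integrate |G|^2 * |x|^2 over [0, 2*pi*f_max] with the midpoint rule."""
    w_max = 2.0 * math.pi * f_max_cps
    density = sigma_in ** 2 / w_max          # flat |x(w)|^2 on [0, w_max]
    dw = w_max / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * dw
        total += density / ((w * t1) ** 2 + 1.0) * dw
    return math.sqrt(total)


# Hypothetical example: 6.5 fps input standard deviation, 5 sec lag.
sigma_in, t1 = 6.5, 5.0
print(lag_output_std_closed_form(sigma_in, t1),
      lag_output_std_numeric(sigma_in, t1))
```

The two values differ slightly because the closed form extends the integral to infinity, which is the approximation the text justifies by requiring a time constant well above 0.25 seconds.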
Using these relations and the previously derived formulas for Threshold
Exceedance Rate, TER, the Average Time above the threshold, $T_A$, and
the probability distribution for time spent above a threshold, Table H-7
may be constructed. The data in Table H-7 applies to rough turbulent
conditions. Contributions to transient fault rates from quiescent parts
of the flight will be negligible since the rates increase faster than expo-
nential with increasing standard deviations. The data of Table H-7
should therefore be modified by multiplication by the fraction of the total
flight time spent in turbulent conditions.
It appears that the vertical velocity and perhaps even the roll angle signal
will cause nuisance failure problems. This prediction has been verified
by flight testing which has indicated that the SSFD mechanization with the
parameters of Table H-6 will lead to excessive nuisance failure rates.
However, the nuisance failure problems may be alleviated by opening up
the washout filter thresholds for the vertical velocity and roll angle signals.
Deterministic Errors: Deterministic errors are signal deviations that may
be eliminated by compensation. Two main sources are sensor signal bias
and scale factor differences. Table H-8 displays biases and scalefactors
for a selected number of sensor signal deviations where the deviation of a
certain signal, $\Delta A$, is the difference between this signal and the mean
value of three signals, i.e.
$$\Delta A = A - \frac{1}{3} (A + B + C)$$
except for the trackangle, $\Delta$TKA, and localizer signals for which the
deviation is derived from two signals:
$$\Delta A = A - \frac{1}{2} (A + B)$$
[Page 506 missing from available version]
The data in the table is based on flight test records. 1000 samples spaced
50 msec in time were used for the estimation of biases and scalefactors
during a period of aircraft maneuvering. The correlation coefficient of the
last column pertains to the correlation between a (bias compensated) signal
difference and the average signal level. A value close to one indicates a
significant scalefactor effect.
The Compensated Limited Average (CLA) algorithm is designed to effectively
remove bias errors but will not compensate for scalefactor errors. More
advanced algorithms have, however, been designed and tested by Boeing
which perform on-line scalefactor compensation as well as bias compensation.
Sensors exhibiting significant scalefactor errors are the pitch and roll angle
gyros for which the scalefactor deviations are in the order of 8-10%. Since
the pitch angle threshold level initially was set at 1° and the roll threshold
at 3.5° in the ICPS algorithm, a pitch angle of 1°/0.08 ≈ 12° or a roll angle
of 3.5°/0.1 = 35° will cause a threshold exceedance if no scalefactor
compensation is employed.
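The arithmetic behind these attitude limits is simply the threshold divided by the fractional scalefactor deviation. A minimal sketch, with a hypothetical function name:

```python
def angle_at_threshold(threshold_deg, scalefactor_deviation):
    """Attitude angle at which an uncompensated scalefactor deviation alone
    produces a signal difference equal to the threshold."""
    return threshold_deg / scalefactor_deviation


pitch_limit = angle_at_threshold(1.0, 0.08)   # pitch: 1 deg threshold, 8% deviation
roll_limit = angle_at_threshold(3.5, 0.10)    # roll: 3.5 deg threshold, 10% deviation
print(pitch_limit, roll_limit)
```

Both angles are well within normal maneuvering, which is why scalefactor compensation is assumed for these two signals.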
For the study, it will therefore be assumed that effective scalefactor com-
pensation is being employed for these two signals so that the only remaining
nuisance failure sources are random in nature.
3.8 ARCS TRANSIENT FAILURE RATE AND
The primary approach to sensor failure monitoring suggested for ARCS is
the Compensated Limited Average (CLA) algorithm
augmented by scalefactor compensation. The main components
of this algorithm are a static detector for slow ramp (or bias) failures and a
dynamic detector for oscillatory and hardover failures. The static detector
monitors the signal bias level by comparison with prescribed thresholds. The
dynamic detector consists of a threshold comparison followed by a time delay.
The strategies of handling faults occurring at duplex and higher redundancy
levels were defined in sections 3.4.1 and 3.4.2.
In the Compensated Limited Average (CLA) implementation, the
static threshold is set at a level at which a permanently deviating signal
adversely would affect the control of the aircraft. This level, which may
be established by simulation or engineering judgement, is assumed to be
represented by the TH1 values of Table H-6.
The bias integrating filter will, for this analysis, be treated as a first order
lag with a time constant selected to achieve an acceptably low nuisance failure
rate. No detection time delay will be implemented following the static
failure detector.
The dynamic detector, which monitors the raw unfiltered difference signal,
will also be assigned a level compatible with an acceptable nuisance failure
rate. The main function of this detector is to provide rapid hardover failure
detection.
Analysis indicates that all signals listed in Table H-5 except the three pre-
viously troublesome signals, i.e., vertical acceleration, vertical rate and
roll angle, may be monitored by setting the static and dynamic thresholds
corresponding to the TH1 levels of Table H-6. Note, however, that the
difference signals monitored by the CLA are the deviations between the
individual signals and their average value. This implies that the thresholds
defined in Table H-6 have to be reduced by one-half for the CLA algorithm.
Those threshold levels will, with an integrating time constant of 10-15 sec,
define acceptable dynamic failure detection algorithms for all but the above
mentioned troublesome signals.
Regarding the vertical velocity signal, which is the most problematic, a dynamic
threshold at TH1 = 5.6/2 = 2.8 fps will result in an excessive rate. However,
monitoring at this level may instead be performed by the static threshold,
setting the bias compensating time constant equal to 10 seconds. Hardover
failures are detected and isolated without delay by a dynamic detector located
at five sigma, i.e., 5 × 6.5/2 fps ≈ 16 fps. The recovery delay period for
faults at triplex or quadruplex redundancy discussed in section 3.4.1 will
be set equal to 10 sec and the threshold exceedance delay for a fault occurring
in duplex set equal to one second in agreement with the discussion of
section 3.4.2.
It is assumed that the vertical acceleration signal is monitored by a static
detector with TH1 = 0.23 fps² and a time constant equal to 5 sec. The dyna-
mic threshold is located at five times the turbulent rms level, i.e.,
at 5 × 0.08 = 0.4 fps².
Finally, it is assumed that the roll angle signal, like the vertical acceleration,
is monitored at TH1 = 3.5° by the static detector using a time delay of 5 seconds.
The dynamic threshold is set at five times the turbulent rms level, i.e.,
at 5 × 0.85° = 4.25°.
The resulting ARCS transient fault rates are listed in Table H-9.
Note that the data pertains to turbulent conditions. The actual experienced
exceedance rates probably will be significantly lower. In the ARCS
analysis the assumption was made that the nuisance rates will be ten times
lower than in Table H-9.
Initially, the analysis of the ICPS sensor nuisance failures was performed
with the intention of using the results for the WWCS reliability analysis.
The predicted nuisance failure rates would, however, completely dominate
the results. To make the comparison between the ARCS and WWCS mean-
ingful, it was therefore assumed that the nuisance failure rates are the
same for these systems.
[Table H-9 (ARCS transient fault rates) is not legible in the scanned source.]
4.0 ARCS COVERAGE PARAMETER (RATE RATIO) ASSESSMENT
In this section, coverage parameters for permanent failures will be assessed.
As a ground rule for the study, it will be assumed that the coverage of any
single module failure which occurs at a triplex or higher redundancy level
is unity, i.e., that the system always survives a single failure. This failure
could, however, be either detected or undetected (latent). The system will
survive a single module failure in duplex operation if and only if the particular
failure is covered, i.e., if and only if the failure is detected and isolated
and successful redundancy degradation is accomplished. Detection of a
faulty condition, which could be transient or permanent, will normally be
done via signal comparison. Isolation will be attempted based on reasonable-
ness tests and/or module selftest. When a failed module has been success-
fully isolated, redundancy degradation will be implemented by continuing
operation using the remaining, operational, module.
The most difficult redundancy management task is proper failure isolation.
Reasonableness tests, which in scope may range from a simple hardover
detection to sophisticated algorithms based on modern systems theoretical
methods ("analytical" or "functional" redundancy), may be applied for fail-
ure isolation. Another possibility is self-test monitors or subsystem level
tests of several modules, for example wraparound input-output test.
In the following, coverage, or rather, permanent failure rate ratio parameters,
will be assessed for the different ARCS modules.
4.1 COMPUTER COVERAGE
It was recognized early in the ARCS program that an accurate estimation of
the (second failure) computer coverage would be virtually impossible without
detailed drawings, specifications and circuit diagrams for the computer.
Furthermore, even if this documentation were available, the task of estimat-
ing the coverage would be large enough to fall outside the scope of the initial
ARCS program effort. The decision was therefore made to postulate a
certain coverage value which was deemed to be conservative for the ARCS
system. The numbers postulated were 0.95 for the second failure coverage,
i.e., a 95% probability of surviving a failure in duplex, and 0.90 for the
probability of detecting a failure when operating in simplex. If it could be demonstrated
that the reliability requirements would be satisfied assuming these numbers,
the potential feasibility of the ARCS system would be established. In the case
of a future hardware implementation of the system, these assumed parameter
values will have to be demonstrated by testing on the actual hardware, possibly
supported by a gate level simulation.
4.2 SERVO STAGE RATE RATIOS
At the time of the reliability study the results from the General Electric
coverage study (Appendix G) were not yet available. Tentative values for
the servo rate ratios were therefore used in the analysis. The coverage
in duplex was assumed equal to 0.95 and the probability of detecting a
failure when operating in simplex was assumed to be 0.90. These values
were later, by the result of the GE coverage study, shown to be repre-
sentative although somewhat conservative.
4.3 SENSOR RATE RATIOS
Rate ratios for the various sensors were assessed based on several dif-
ferent sources. However, it was very difficult to get firm estimates for
sensor rate ratios from these sources. The only sensors for which adequate
FMEA study results were available, were the Radio Altimeter (R/A) and
the ILS receiver. The rate ratio assessment for the other sensors had to
be based on rather fragmentary data supported by engineering judgement.
As a general ground rule it was assumed that, except for the R/A and ILS,
no special dedicated monitoring circuitry was to be used. The monitoring for
each of these sensors was assumed to mainly consist of a reasonableness
test capable of isolating hardover failures and to some extent passive failures.
Sources consulted to establish sensor rate ratios are listed below:
• "Fail-Operational Autoland Failure Analysis", Boeing Document No. D6-33219
• Vendor Failure Analyses for the Radio Altimeter and ILS Modules
• Boeing In-House Experience
• United Airlines Maintenance Records
• Sensor Failure Mode Data from Airline Maintenance Experience extracted
by Boeing
• "Definition Study for Advanced Fighter Digital Flight Control System",
AFFDL-TR-75-59 (Rough Draft)
In addition to extracting information from these sources, a special monitoring
approach was developed for the R/A sensor. This sensor is critically needed
for the autoland function and exhibits a high coverage sensitivity. It is there-
fore desirable to obtain a high coverage for this sensor. In addition to achiev-
ing this goal, the monitoring algorithm developed demonstrates the feasibility
of using "analytical redundancy" to improve sensor coverage.
The method utilizes the redundant information available in the vertical accel-
eration signal from the Normal Accelerometer, together with the vertical
velocity signal from the Air Data Computer, to artificially construct a redun-
dant altimeter signal. This signal is of high enough quality to serve
as a replacement for an R/A signal if it fails. This implies that an R/A fail-
ure at duplex redundancy could be perfectly covered by voting between three
signals, one of which is artificially created. Figure H-8 shows an example
of the performance of the analytically redundant signal during the final phase
of a landing. A detailed description of the method is given in reference H-6.
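The voting step described above can be illustrated with a minimal sketch. It assumes a mid-value (median) selection rule among the two remaining R/A signals and the analytically constructed altitude; the report does not state the voting rule here, and the function names and sample values are hypothetical. The construction of the synthesized signal itself (reference H-6) is not reproduced.

```python
def mid_value_select(a, b, c):
    """Return the median of three redundant signal samples."""
    return sorted((a, b, c))[1]


def vote_altitude(ra_1, ra_2, synthesized):
    """Vote between two radio altimeter samples and one synthesized sample.

    Assumption: mid-value selection is used as the voting rule.
    """
    return mid_value_select(ra_1, ra_2, synthesized)


# Hypothetical samples in feet: all signals healthy.
healthy = vote_altitude(50.0, 50.4, 49.8)
# Hypothetical hardover failure of the second R/A.
hardover = vote_altitude(50.0, 2500.0, 49.8)
print(healthy, hardover)
```

With a median rule a single hardover signal can never be selected, which is the sense in which a third, synthesized signal restores voting capability at duplex redundancy.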
[Figure H-8 (performance of the analytically redundant signal during the final phase of a landing) is not legible in the scanned source.]
A summary of the sensor rate ratio parameters used in the study is given
in Table H-10. These values are to be considered conservative, except
for the R/A and possibly the ILS, in that the coverages, r23, probably
could be increased substantially by the use of special hardware monitoring
circuits.
TABLE H-10 ASSESSED SENSOR RATE RATIOS

SENSOR          r23     r35
R/A             1.0     0.0
ILS             0.99    0.01
YAW RATE        0.60    0.75
ACCEL           0.75    0.25
DG              0.75    0.40
COMPASS         0.75    0.40
VG              0.75    0.40
CONTR. FORCE    0.50    0.50
DADC            0.90    0.10
PITOT/STATIC    0.50    0.50
ISAD GYRO       0.95    0.10
REFERENCES
H-1 "An Improved CLA Signal Selection and Failure Detection Algorithm for
Continuous Signals". Boeing Coord Sheet No. B-8244-10-096.
H-2 "Fail-Operational Autoland Failure Analysis". Boeing Document D6-33219.
H-3 "Stationary and Related Stochastic Processes", H. Cramér and
M. R. Leadbetter, Wiley 1967.
H-4 "Statistical Distributions of MTBE Predictions", Boeing Computer Services
Coord Sheet No. EM-11.
H-5 "Redundant Flight-Critical Control System Evaluation", FAA-SS-73-2-2.
H-6 "Use of State Estimators to Cover Radar Altimeter Failure During Autoland"
Boeing Coord Sheet B-8244-10-121.
APPENDIX I
PROCESSOR CROSS-CHANNEL COMMUNICATION LINK STUDY
1.0 INTRODUCTION
The objective of this study was to survey and tradeoff different methods of imple-
menting the processor cross channel communication function, taking into account
the ARCS operational requirements and design concept constraints. The more
significant of these are:
1) A data transmission rate requirement of at least 20,000 words/s.
2) A method of implementation which does not significantly impair the
effective processor throughput.
3) ARCS design criteria and groundrules; in particular the requirement of no
interference between computers.
4) Redundancy management related requirements and constraints.
Among the many possible cross channel link configurations only a few will satisfy
all the ARCS requirements. These configurations are identified and compared
with respect to single point failure hazard, reliability, cost and adaptability to
optical implementation.
It is shown that the originally proposed ARCS baseline communication link is the
implementation that presently best satisfies the requirements.
2.0 POSSIBLE CROSS CHANNEL TRANSMISSION IMPLEMENTATIONS
In the design of a cross channel communication link, several parameters are of
interest. The more significant of these are identified below.
2.1 TRANSMISSION TECHNIQUES
The data exchange on a bus is either unidirectional or bidirectional. Further-
more, on a bidirectional bus the communication may either be half-duplex or
full-duplex. Half-duplex implies that communication may take place in both
directions but not at the same time while full-duplex implies simultaneous com-
munication in both directions.
2.2 REDUNDANCY PRINCIPLES
To obtain the required survivability, the cross channel communication function
has to be redundant so that no single failure will cause a system failure.
This redundancy may either be realized by transmitting over active
redundant links or by using a primary active link backed up by passive
spares. The latter implementation will require half-duplex or a full-
duplex mechanization.
Only active redundancy will be acceptable for the ARCS, because of the
difficulty of proving that all processors in the system will be able to correctly
switch to a spare upon any failure of a primary link, i. e., the difficulty
of showing that no single point failure source exists.
2.3 TRANSMISSION MODES
Four different transmission modes will be considered: independent,
sequential, synchronized and command/response. Independent trans-
mission implies that the data exchange is accomplished without any inter-
action whatsoever between processors.
By sequential transmission is meant the scheduling in time between different
processors such that transmission is performed in a prescribed sequence, each
processor transmitting in a given time interval.
In synchronized transmission the processors exchange information on a word
by word basis, transmitting simultaneously and storing the received data
in a coordinated, synchronized manner.
The command/response mode is self-explanatory.
Of the four modes considered above, the command/response mode may
directly be ruled out by the ARCS "noninterference" groundrule.
If synchronized transmission is to be used for ARCS, the timing reference
for the data exchange has to be generated independently in each processor. This
necessarily lowers the effective transmission rate, since the transmission
intervals have to be expanded to accommodate time skew between channels.
Synchronized transmission is therefore a less desirable alternative for
the ARCS.
With regard to sequential transmission, the observation is first made that
this mode offers no advantages over independent transmission for unidirectional
busses. Second, the sequencing between processors requires sched-
uling, which has to be managed independently by each processor to satisfy
the noninterference groundrule. Finally, sequential transmission greatly
increases the time required for the data exchange if not compensated by
higher transmission rates. The sequential mode therefore is also less
desirable for the ARCS.
This leaves independent transmission as the only desirable transmission
mode for the ARCS cross channel communication.
2.4 PROCESSING TIME CONSTRAINT
It is estimated that up to seventy-five percent of the total processing time
will be required to handle the redundancy management and control law tasks
for the ARCS. In a 20 ms frame, this leaves 5 ms for other tasks including
cross channel data transmission. The required amount of data exchanged per
frame [remainder of sentence illegible in source]. The average processor
execution time per transmitted word is therefore 12.5 µs. From this, it is
clear that the data exchange should take place with a minimum of processor
[text missing in source]
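The per-word time budget quoted in this section can be checked with a short calculation. The sketch below assumes that the number of words exchanged per frame follows from the 20,000 words/s requirement stated in the Introduction; that pairing is an inference, since the sentence giving the per-frame word count is not legible in the source. The function and constant names are hypothetical.

```python
FRAME_S = 0.020          # 20 ms computation frame (from the text)
BUSY_FRACTION = 0.75     # share used by redundancy management and control laws
WORDS_PER_S = 20_000     # assumed: Introduction's minimum transmission rate


def per_word_budget(frame_s, busy_fraction, words_per_s):
    """Return (spare time per frame in s, words per frame, budget per word in us)."""
    spare_s = frame_s * (1.0 - busy_fraction)
    words_per_frame = words_per_s * frame_s
    budget_us = spare_s / words_per_frame * 1e6
    return spare_s, words_per_frame, budget_us


spare_s, words_per_frame, budget_us = per_word_budget(
    FRAME_S, BUSY_FRACTION, WORDS_PER_S
)
print(f"spare time per frame: {spare_s * 1e3:.1f} ms")
print(f"words per frame:      {words_per_frame:.0f}")
print(f"budget per word:      {budget_us:.1f} us")
```

Under these assumptions the 5 ms of spare time is shared by about 400 words per frame, which reproduces the 12.5 µs figure and includes all other non-control tasks, not only the data exchange.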
2.5 IMPLEMENTATION CONFIGURATIONS
Figure I-1 shows three possible bus configurations. Configuration I-1(a)
uses six unidirectional, dedicated, transmission links. The delta config-
uration of figure I-1(b) consists of bidirectional busses connecting the
three processors, and figure I-1(c) shows a "TEE" configuration using a
common bidirectional bus. Figure I-1(d), the "asymmetrical TEE", shows an
alternate to the TEE which requires less cabling since the cable junction
has been moved inside one processor interface.
FIGURE I-1 CROSS CHANNEL LINK CONFIGURATIONS: I(A) 6 UNIDIRECTIONAL LINKS; I(B) DELTA; I(C) TEE; I(D) ALTERNATE
2.6 SINGLE POINT FAILURE HAZARD AND RELIABILITY
A fundamental requirement for ARCS is that the system function should
survive any single failure. Configuration I-1(a) satisfies this requirement,
since loss of one bus leaves the communication between two processors
intact. A bus failure will therefore, at most, lead to redundancy degrada-
tion from triplex to duplex, provided the failure is properly localized by
all three computers. The possibility of bypassing a faulty link by relaying
over another computer is ruled out by transmission rate constraints and the
ARCS redundancy management design.
A similar situation exists for the delta configuration where any bus failure
will leave the communication between two computers intact. As in the
previous case it is important that all computers are able to localize the
fault.
A different situation exists for the TEE configuration where a single worst
case failure at the bus junction conceivably could cause total loss of com-
munication between computers. Redundancy is therefore required for this
configuration. Figure I-2 shows a possible alternative using a triplicated
"alternate TEE" configuration.
With full-duplex implementation, the required fault tolerance could con-
ceivably be obtained by two TEE's instead of three. A symmetric configura-
tion is however required in order to satisfy the ARCS groundrule of identical
computer hardware and software.
FIGURE I-2 TRIPLICATED TEE CONFIGURATION
Note that with unidirectional instead of bidirectional transmission the
symmetric configuration of figure I-2 reduces to that of figure I-1(a). The
bidirectional implementation exhibits higher fault tolerance but will be
considerably more complex than the unidirectional. This increased
complexity is hardly justifiable, since the resulting reduction in system
failure probability only amounts to a few percent.
From the above considerations, we find that the configuration of figure I-1(a),
using six unidirectional links, and the delta configuration of figure I-1(b), using
three bidirectional full-duplex links, are the two remaining cross channel
link contenders for the ARCS. However, the delta implementation will
require somewhat more complicated hardware and will be more difficult
to adapt to optical transmission, giving a slight edge to the configuration
of figure I-1(a).
In conclusion, the most effective cross channel link implementation uses
six unidirectional links. Each processor transmits data to the other two
processors independently via two dedicated links.
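The single-failure argument for the six-link configuration can be checked by enumeration. The following is a minimal sketch, assuming three processors labelled A, B and C (hypothetical names) and treating each dedicated link as a directed (source, destination) pair; relaying through a third computer is excluded, as in the text.

```python
from itertools import permutations

PROCESSORS = ("A", "B", "C")
# Six unidirectional, dedicated links: one per ordered pair of processors.
LINKS = set(permutations(PROCESSORS, 2))


def intact_pairs(links):
    """Processor pairs that can still exchange data in both directions."""
    return {frozenset((src, dst)) for (src, dst) in links if (dst, src) in links}


def survives_any_single_link_failure(links):
    """True if at least one processor pair stays connected after any one link loss."""
    return all(len(intact_pairs(links - {failed})) >= 1 for failed in links)


worst_case = min(len(intact_pairs(LINKS - {failed})) for failed in LINKS)
print(len(LINKS), survives_any_single_link_failure(LINKS), worst_case)
```

In this model, losing any one link still leaves two processor pairs with two-way communication, so the worst outcome is the degradation from triplex to duplex described above.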
3.0 DATA RATE AND TRANSMISSION FORMAT
There are several considerations in the selection of a data format for the
cross channel communications link. Those to be considered here are:
Data Word Size
Data Rate
Hardware Complexity
Reliability of Communications
Failure and Error Detection
Existing Standards
3.1 DATA WORD SIZE AND DATA RATE
A data word will consist of 16 data bits, a 10-bit label and a parity bit
for a total of 27 bits. If a data transfer rate of 35,000 words per second
is required the bit rate will be 0.945 x 10^6 bits per second (bps) exclusive
of overhead for bit, word and message synchronization. The baseline cross
channel configuration requires a minimum of overhead for protocol, hence
a bit rate of 2 x 10^6 bps would provide adequate margin and some growth
potential.
Serial data transfer rates of 5 x 10^6 bps are well within the state-of-the-art
so it is clearly feasible to use serial data transmission in this application
and reap the associated benefits of simplicity, low cost and reliability.
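The bit rate quoted above follows directly from the word size and word rate. A short sketch of the arithmetic, with hypothetical function and variable names:

```python
def raw_bit_rate(data_bits=16, label_bits=10, parity_bits=1, words_per_s=35_000):
    """Bit rate in bps, exclusive of synchronization and protocol overhead."""
    return (data_bits + label_bits + parity_bits) * words_per_s


rate = raw_bit_rate()
margin_at_2_mbps = 2_000_000 / rate
print(rate, round(margin_at_2_mbps, 2))
```

At 35,000 words per second the 27-bit word gives 945,000 bps, so a 2 x 10^6 bps link carries roughly twice the raw payload rate before synchronization overhead is added.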
3.2 TRANSMISSION FORMAT
Modern high-rate serial data transmission standards typically specify a
self-clocking code transmitted on a single signal path per channel. This
approach minimizes inter-connecting cables and avoids timing errors
frequently associated with parallel clock, data and synchronization lines.
Although MIL-STD-1553A pertains to a large-scale data bus system, the
serial data transmission format specified therein provides a good model
for the cross channel data format. This standard employs bipolar biphase-L
encoded data in a word format that begins with a 3-bit synch pattern, 16 bits
of data and a parity bit for a total of 20 bit-times per word. The synch
pattern is used to obtain bit synchronization and to signify a data word or
a command/status word. A message consists of a command or status word
followed by up to 32 data words. This format has been found suitable for
fiber-optic as well as wire data transmission for which it was designed.
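The self-clocking property of the biphase-L code and the role of the sync pattern can be illustrated with a minimal sketch. It models each bit-time as two half-bit levels and assumes the polarity convention in which a logic one is sent as high-then-low; the sync pattern is modelled as one and a half bit-times high followed by one and a half bit-times low. Function names are hypothetical, and the sketch ignores the bipolar line levels.

```python
def encode_biphase_l(bits):
    """Encode bits as half-bit levels: 1 -> (1, 0), 0 -> (0, 1)."""
    halves = []
    for bit in bits:
        halves.extend((1, 0) if bit else (0, 1))
    return halves


def decode_biphase_l(halves):
    """Decode half-bit levels; every bit cell must contain a mid-bit transition."""
    if len(halves) % 2:
        raise ValueError("incomplete bit cell")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition (invalid biphase-L)")
        bits.append(1 if first == 1 else 0)
    return bits


# Assumed sync shape: three bit-times with a single transition in the middle.
COMMAND_SYNC = [1, 1, 1, 0, 0, 0]

print(encode_biphase_l([1, 0, 1]))
```

Because every valid data bit has a transition at mid-bit, the sync pattern cannot be mistaken for data, and a receiver can flag any bit cell without a transition as a fault.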
A recommended data transmission format based on MIL-STD-1553A and
adapted to the cross channel link requirements is as follows:
Sync Pattern:    Command/Status and Data Sync (3 bits, invalid code)
Data Encoding:   Biphase-L (Manchester)
Word Format:     Word Length 30 bits
                   Sync (3 bits)
                   Data & Label or Command/Status (26 bits)
                   Odd Parity Bit (1 bit)
Word Types:      Command, Status, Data
Message Format:  Begin with Command or Status Word, followed
                 by up to 64 data words and an optional status word
                 containing a block check sequence.
This format should provide sufficient flexibility to accommodate a number of
different communication protocols with a minimum of overhead. The 10 label
bits which are not part of the MIL-STD-1553A format are to effect proper
placement of data in a 1024 word RAM in the computer input section. The
maximum length of one message will be 66 words of 30 bits each for a total
of 1980 bits. Thus, a transfer of one block of data will require approxi-
mately 1 millisecond at a 2 megabit rate and 0.4 millisecond at a 5 megabit
rate.
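The word layout and the message timing above can be sketched as follows. The field order (label in the high bits, data below it, parity last) is an assumption made for illustration; the report gives field widths but not bit positions. The 3-bit sync pattern is an invalid code and is therefore not represented as data bits here. All names are hypothetical.

```python
def odd_parity_bit(value):
    """Bit that makes the total number of ones (value plus parity) odd."""
    return 0 if bin(value).count("1") % 2 else 1


def pack_word(label, data):
    """Pack a 10-bit label and 16-bit data into 26 bits plus one odd parity bit."""
    if not 0 <= label < (1 << 10):
        raise ValueError("label must fit in 10 bits")
    if not 0 <= data < (1 << 16):
        raise ValueError("data must fit in 16 bits")
    payload = (label << 16) | data          # assumed field order
    return (payload << 1) | odd_parity_bit(payload)


def parity_ok(word):
    """Odd parity check over the 27 packed bits."""
    return bin(word).count("1") % 2 == 1


def message_time_ms(words=66, bits_per_word=30, bit_rate=2_000_000):
    """Duration of one maximum-length message, sync bits included."""
    return words * bits_per_word / bit_rate * 1e3


word = pack_word(0x155, 0xBEEF)
print(parity_ok(word), parity_ok(word ^ (1 << 5)), parity_ok(word ^ (0b11 << 5)))
print(message_time_ms(), message_time_ms(bit_rate=5_000_000))
```

The last parity check in the example flips two bits and still passes, which is the weakness of word parity discussed in section 3.3.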
3.3 COMMUNICATIONS FAILURE AND ERROR DETECTION
The recommended communications format described above, together with
a protocol suited to the selected cross channel link configuration, provides
many ways to detect and isolate failures in the cross channel link. Mes-
sage integrity checks that can be performed by hardware at each receiver are:
Valid Sync Patterns
Valid Biphase-L Bits
Valid Parity Bits
Valid Address
Correct Message Length
Valid Check Sequence (optional)
Valid Status Bits
The recommended message format provides command and status words that
would allow the computers to pass instructions and failure status from
one to another and to control the flow of traffic. A message comprised of
a command word can request status or data and a reply would include a
status word as a minimum plus any required data. Failure of a computer
to respond to a command from a neighbor would signal the requestor that a
failure has occurred.
Several error detection and error correction schemes have been developed
for use in data communications systems. Selection of an error detection
scheme for the cross channel link can be made on the basis of channel quality,
consequence of an undetected error, added communications burden and cost.
Probably the simplest and most widely used forward error detection scheme
is parity. This scheme works reasonably well in serial data channels where
bit errors are independent and the error rate is small. A parity check fails
to detect errors that occur in pairs within the span of bits covered by parity.
The main advantage of parity is its simplicity of implementation and low cost.
A more sophisticated and effective error detection scheme is the cyclic
redundancy check (CRC). In this scheme a number of check bits (a code) is
computed as a group of data bits is being sent, and the check bits are sent
following the data. At the receiver a similar computation is performed on
the received data and any disagreement of the receiver derived check bits
with those sent following the message signals an error in the message. This
check can detect all errors within a span of bits equal to the length of the
code and it can also detect a large percentage of errors scattered throughout
the data and exceeding the span of the code. It can be shown that a 16 bit
CRC protecting 64 words of 26 bits each can yield an undetected bit-error
rate at least 1000 times smaller than would be achieved with a parity check
on each word. The CRC is easy to implement because the necessary compu-
tation at each terminal can be performed by a circuit in a single 14-pin
package. The main disadvantage to this scheme is the delay of the integrity
check until the end of a data block transfer. Thus, the data block must
usually be held temporarily at the receiver until the check can be completed.
A FIFO memory could perform this function but is an added cost.
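The difference between word parity and a block CRC can be shown with a small experiment. The sketch below assumes the CCITT generator polynomial x^16 + x^12 + x^5 + 1 with an all-ones initial value; the report does not name a polynomial, so this choice is illustrative only. Each 26-bit word is stored in four bytes for convenience, which is also an assumption, and the test data are arbitrary.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 using polynomial 0x1021, most significant bit first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def parity(word: int) -> int:
    """Parity (0 or 1) of the number of one bits in a word."""
    return bin(word).count("1") % 2


def pack_block(words):
    """Serialize 26-bit words, four bytes each (illustrative packing)."""
    return b"".join(w.to_bytes(4, "big") for w in words)


# A block of 64 arbitrary 26-bit words and its check sequence.
words = [(i * 2654435761) & 0x3FFFFFF for i in range(64)]
sent_crc = crc16_ccitt(pack_block(words))

# A two-bit burst error inside one word.
corrupted = list(words)
corrupted[10] ^= 0b11 << 7

parity_detects = parity(corrupted[10]) != parity(words[10])
crc_detects = crc16_ccitt(pack_block(corrupted)) != sent_crc
print(parity_detects, crc_detects)
```

The paired error leaves the word parity unchanged, while the 16-bit check sequence changes because the burst is shorter than the code, in line with the burst-detection property described above.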
In the cross channel link the communication channel quality should be such
that the raw bit error rate will be completely negligible. Thus, the function
of the data integrity check will be to defeat errors caused by equipment
failures or abnormal interferences such as electrical transients. These error
sources are likely to create burst errors which may defeat a simple parity
check. Hence, a 16-bit CRC check for each block transfer of 64 words or
less is recommended. We also recommend word parity as a backup to
the CRC check for detection of isolated, widely separated bit errors.
These two schemes working together will form a powerful error detection
system. A forward error correction scheme that would add somewhat more
complexity is not necessary in this application because a detected error
could initiate a request to repeat the message or the defective data block
can simply be discarded.
4.0 OPTICAL VS ELECTRICAL COMMUNICATIONS
In the previous sections a comparison was made between various configura-
tions without regard to the transmission medium being used. In this section
the advantages of optical vs electrical cross channel communications will
be explored. Discussions of wire and fiber optic implementation of the
baseline configurations are also presented.
4.1 ADVANTAGES OF FIBER OPTIC DATA TRANSMISSION
The transmission of digital signals over fiber optic lines has some unique
features which make it an attractive alternative to conventional wire tech-
niques for cross-channel data communications. Some advantages of fiber
optic data transmission are:
Large Bandwidth
EMI/EMP Immunity
No Cross Talk
No Electrical Short Circuits
No Ground Loop Problems
Electrical Isolation of Terminals
Light Weight
Graceful Connector Degradation
The reliability of a fiber optic data link, from a signal integrity viewpoint,
is very high due to its immunity to electromagnetic interference (EMI),
electromagnetic pulse (EMP) and cross talk. This reliability is enhanced
by the fact that the optical channel is a non-conductor of electricity; thus,
there is no possibility of data link ground loops or electrical short circuits.
The electrical isolation of transmitter and receiver precludes the propagation
of an electrical fault from one processor to another via the cross channel
data link. Furthermore, fiber optic connectors tend to degrade grace-
fully, reducing the chance of intermittent connections sometimes exper-
ienced with wire system connectors.
The data rate of a fiber optic data link implemented with off-the-shelf LED's
is limited by the speed of the light emitting diode. Typical LED rise and fall
times are 10-15 nsec. Experimental LED's are available with rise times as
low as 1 nsec. Therefore, data rates higher than 65 Mbits/sec are possible
with presently available components. This is far greater than the data
rates possible with twisted shielded pair wire cross channel data links.
The mechanical ruggedness of fiber-optic data links is currently being
investigated. Some preliminary test results suggest that the mechanical
reliability of fiber optic cable is greater than wire. Furthermore, new
fibers and protective jacketing materials are being developed which should
provide uncontested mechanical integrity.
4.2 IMPLEMENTATION OF THE BASELINE CROSS CHANNEL CONFIGURATION
The baseline cross channel configuration, shown in figure I-1(a), employs
6 unidirectional data links between the 3 processors. Each processor has
two transmitters and two receivers. The relative complexity of imple-
mentation with wire and fiber optics is comparable. With the short distances
between computers and a 2 megabit per second or lower data rate, either
system can be implemented without difficulty.
To obtain electrical isolation between computers in the wire system, it is
necessary to use transformer coupling in the transmitter and/or receivers.
Although the desired isolation can be verified at the time of manufacture, it
is difficult to measure, and hence ensure, the isolation with the equipment in
service. With a fiber-optic system, however, complete electrical isolation
between terminals is inherent in the transmission medium and electrical
fault propagation from one processor to another via the data link is not
possible.
4.3 COST AND STATE-OF-THE-ART
The relative cost of wire and fiber optic data transmission systems is a
function of both relative complexity and state-of-the-art. Wire data trans-
mission systems are highly developed whereas their fiber-optic counterparts
have only recently emerged as contenders. Thus, it is to be expected
that for systems of equal complexity, a wire implementation would be
lower in cost at this time. Within a few years, however, fiber optic data
transmission systems should be fully cost competitive with wire and a
selection would be based largely on other factors.
The relative complexity of wire and fiber optic implementation for the
ARCS cross channel link can be assessed with the aid of figure I-3.
The computer I/O, encoder and decoder are virtually identical for either
implementation, thus, any basic difference in complexity or cost will be
found in the transmitter, receiver and transmission medium. Wire and
fiber optic transmitters are both quite simple and comparable in complexity.
A fiber optic receiver for a short data link or small data bus is no more
complex than a wire line-receiver at present because either can be pur-
chased as single-package integrated circuits. The wire line-receiver is
less costly at present because of product maturity but this is probably a
temporary situation. Finally, as a transmission medium, optical fibers
and terminations are currently more costly than wire but promise a cost
advantage in the future because of simplicity and low-cost raw materials.
However, because of the small amount of cable required to interconnect
the computers, small differences in cable costs will have little effect on
overall system costs.
In conclusion, a wire cross channel link has a cost advantage at present
due to maturity of the technology. The cost difference is difficult to assess
on a quantitative basis because of the rapidly changing state of fiber optic
technology. All indications point to increased use and reduced cost of
fiber optics in the near future and ultimately the choice will rest almost
entirely on the system performance and failure effects trades.
[Figure I-3 is not legible in the scanned source.]
5.0 SUMMARY
A cross channel communication link implementation, which uses six
unidirectional links between the three processors, is recommended for
the ARCS. Each processor transmits data to the other two processors via
two dedicated links in an independent manner. This implies that each pro-
cessor has the ability to store data directly into a receiving buffer without
interfering with the function of the receiving processor.
The configuration suggested, which coincides with the ARCS baseline cross
channel link configuration, is well suited for adaption to optical implementa-
tion. The cost of optical transmission is presently higher than for elec-
trical but this difference is insignificant in relation to the total system cost
and is expected to decrease in the future.
REFERENCES
1. Morris, Everett W.: "Stumbling Blocks Can Be Avoided When Seeking Airworthiness
Approval of Digital Flight Control Systems." AIAA paper 75-576, April 2-4, 1975.
2. "Airworthiness Requirements for Automatic Landing—Including Automatic Landing
in Restricted Visibility Down to Category III." BCAR paper No. 367, prepared by
Air Registration Board, Section D, Airplanes, January 1970.
3. "Environmental Conditions and Test Procedures for Airborne Electronic/Electrical
Equipment and Instruments." Document DO-160, prepared by Radio Technical
Commission for Aeronautics SC-112, February 1975.
