The reliability of small digital controllers by Pearson, Jonathon C.
Durham E-Theses
The reliability of small digital controllers
Pearson, Jonathon C.
How to cite:
Pearson, Jonathon C. (1983) The reliability of small digital controllers, Durham theses, Durham University.
Available at Durham E-Theses Online: http://etheses.dur.ac.uk/7217/
Use policy
The full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or
charge, for personal research or study, educational, or not-for-profit purposes provided that:
• a full bibliographic reference is made to the original source
• a link is made to the metadata record in Durham E-Theses
• the full-text is not changed in any way
The full-text must not be sold in any format or medium without the formal permission of the copyright holders.
Please consult the full Durham E-Theses policy for further details.
Academic Support Office, Durham University, University Office, Old Elvet, Durham DH1 3HP
e-mail: e-theses.admin@dur.ac.uk Tel: +44 0191 334 6107
http://etheses.dur.ac.uk
THE RELIABILITY OF SMALL 
DIGITAL CONTROLLERS 
by 
Jonathan C. Pearson BSc(Eng), ACGI
The copyright of this thesis rests with the author. 
No quotation from it should be published without 
his prior written consent and information derived 
from it should be acknowledged. 
Thesis submitted for the degree of Doctor of
Philosophy in the Faculty of Science
University of Durham
September 1983 
25. JAN. 1984 

Dedicated to 
MY MOTHER AND FATHER 
ABSTRACT 
Increasing use is being made of small digital controllers in Industry
and Commerce. The failure of such controllers is important since it may
either cause plant to become unsafe or interrupt production. Fault-tolerant
techniques are discussed for improving the reliability of digital
controllers with special reference to the development of a hybrid
electromechanical gas governor, whose electronic controller is an example
of a small digital controller. Three microprocessors are used in a two out
of three majority voting configuration and the memory is Hamming code
protected. Redundancy techniques are used to protect against faults in
other parts of the controller and it will tolerate most classes of
transient fault.

When comparing designs or attempting to meet reliability criteria, it
is necessary to predict the reliability of a system and its individual
components. Several sources of failure rate prediction are compared and
the wide variation in the failure rates of integrated circuits is
highlighted. The comparison concludes by recommending which reliability
data source is likely to be most accurate for each type of component.

The gas governor is an example of a repairable system and analysis is
developed for predicting the improvement in reliability for repairable
redundant systems and for determining the optimum maintenance and repair
times for equipment.

The testing of redundant systems is difficult because of their
complexity, and under certain circumstances the redundancy can mask design
faults. Testing methods using complex test equipment are described, as well
as the testing of the experimental controller.

A review is included of other fault-tolerant systems. Although the
work on large computers is not directly applicable to small controllers,
many of the techniques can be used.
Acknowledgements
I should like to express my gratitude to the following people. To Dr
Clive Preece, my Supervisor, for his help, encouragement, friendship, and
assistance in the preparation of this Thesis. To the British Gas
Corporation for the provision of a Research Scholarship and equipment. To
Dr Ken Jenkins for his help, encouragement, and the acquisition of
equipment, and also for the close links between the Engineering Research
Station and the Department of Engineering. I would like to thank Mr Robert
Halse for his help and discussion. To Dr David Smith for his interest and
valuable discussion about this research. To the Electrical Technicians for
jobs too numerous to mention and to Motorola for providing much useful
reliability information. Finally I would like to thank Gillian for her
love and encouragement in the preparation of this Thesis.
CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS
CHAPTER 1 The Need for a Reliable Controller
1.1 NATIONAL GAS NETWORK
1.2 PNEUMATIC GOVERNORS
1.3 LIMITATIONS OF PNEUMATIC GOVERNORS
CHAPTER 2 A Review of Other Fault-tolerant Controllers
2.1 SMALL CONTROLLERS
2.1.1 TMR controllers
2.1.2 Fail-safe controller
2.1.3 Built in test equipment
2.2 LARGE COMPUTERS
2.2.1 Space shuttle and avionics
2.2.2 Commercial computers
2.2.3 Telephone exchange computers
2.2.4 Safety monitoring computer
2.2.5 PDP11 computer
2.2.6 Railway applications
2.2.7 Fault-tolerant software
2.3 SUMMARY
CHAPTER 3 Failure of Components
3.1 FAILURE DISTRIBUTION AND MECHANISMS
3.2 FAILURE RATE MEASUREMENT AND ACCELERATION
3.2.1 Accelerated testing
3.3 FAILURE RATE PREDICTION
3.3.1 CNET
3.3.2 National centre of systems reliability
3.3.3 RADC field data
3.3.4 European space agency
3.3.5 Manufacturers testing
3.3.6 Simple models
3.3.7 GIDEP computer data base
3.3.8 EuReData
3.3.9 MIL-217 data base
3.4 COMPARISON OF FAILURE RATE DATA
3.4.1 Resistors
3.4.2 Capacitors
3.4.3 Soldered joints
3.4.4 Wire-wrap connections
3.4.5 Edge connectors
3.4.6 Integrated circuits
3.4.7 TTL integrated circuits
3.4.8 6800 microprocessor
3.4.9 8080 microprocessor
3.4.10 EPROM
3.4.11 Bipolar ROMs
3.4.12 Dynamic RAM
3.4.13 Static RAMs
3.4.14 Comparison with other field data
3.4.15 Recommendations
3.5 COMPARISON OF DIFFERENT DEVICE TECHNOLOGIES AND ENCAPSULATION
3.5.1 CMOS versus TTL
3.5.2 Hermetic versus non-hermetic encapsulation
3.5.3 Recommendations
3.6 SCREENING AND METHODS OF IMPROVING FAILURE RATES
3.6.1 Derating and cooling
3.7 SOFTWARE AND TRANSIENT ERRORS
3.8 SUMMARY
CHAPTER 4 Prediction of Reliability Improvement due to Fault-tolerant techniques
4.1 METHODS OF EXPRESSING RELIABILITY
4.2 EFFECT OF REDUNDANCY ON RELIABILITY
4.3 EFFECT OF MAINTENANCE ON A SYSTEM
4.3.1 Redundant systems
4.4 METHODS OF EXPRESSING IMPROVEMENT
4.5 SUMMARY
CHAPTER 5 Techniques for Improving a System's Fault-tolerance
5.1 LEVELS OF FAULT-TOLERANCE
5.2 DESIGN TOOLS
5.2.1 FMECA
5.2.2 FTA
5.3 HARDWARE
5.3.1 Watchdog timer
5.3.2 Snake
5.3.3 Power supply levels
5.3.4 Output verification
5.3.5 Component redundancy
5.3.6 Memory protection
5.3.7 Temperature
5.4 SOFTWARE
5.4.1 Exception handling
5.5 SELF-TESTING
5.6 SUMMARY
CHAPTER 6 Effect of System Architecture on Reliability and Cost
6.1 CHOICE OF MICROPROCESSOR
6.1.1 Single chip microprocessors
6.1.2 Eight-bit microprocessors
6.1.3 Sixteen-bit microprocessors
6.2 MAJORITY VOTING
6.2.1 Hardware
6.2.2 Software
6.3 THE USE OF REDUNDANCY AND INCREASED COST
CHAPTER 7 Background to Design of Governor Controller
7.1 SPECIFICATION
7.2 MECHANICAL VALVE
7.3 NON FAULT-TOLERANT CONTROLLER
7.4 METHODS OF INTRODUCING FAULT-TOLERANCE
CHAPTER 8 Description of Experimental Controller
8.1 CONSTRUCTION
8.2 TMR MICROPROCESSOR BOARD
8.2.1 Microprocessor block
8.2.2 Voters
8.2.3 Self synchronising clock
8.2.4 Synchronisation hardware
8.2.5 Reset circuitry and watchdog timers
8.2.6 RS232 interface
8.3 MEMORY BOARD
8.3.1 Decoding
8.3.2 Circuit description
8.3.3 Address decoding and "spare" EPROMs
8.3.4 Memory matrix
8.4 INPUT/OUTPUT BOARD
8.4.1 Real time clock
8.4.2 Multiplexer and A/D converter
8.4.3 Stepper motor driver
8.4.4 Solenoid drive circuitry
8.4.5 Pressure transducers
8.5 PNEUMATIC TEST RIG
8.6 ESTIMATED RELIABILITY
8.6.1 Revised failure rate prediction
CHAPTER 9 Governor Controller Software
9.1.1 MAIN
9.1.2 RESYNC
9.1.3 MERROR
9.1.4 CNTRL
9.1.5 PRESSR
9.1.6 RBKGEN
9.1.7 BLOCK
9.1.8 SLFTST
9.1.9 SFAIL
9.1.10 WTRAP
9.1.11 SNAKE
9.1.12 DFAULT
9.1.13 INITL
9.1.14 COUT
9.1.15 COUTBF
9.1.16 PRBUFF
9.1.17 MSGE
9.1.18 NMOUT
9.1.19 TIMLOG
CHAPTER 10 Testing Of Fault-tolerant Systems
10.1 SYSTEM DEBUGGING
10.1.1 In-circuit emulation
10.1.2 Logic analysis
10.2 FAULT INJECTION AND RECOVERY
10.3 OPERATIONAL TESTING
10.4 TESTING OF THE GOVERNOR CONTROLLER
10.4.1 Pressure control
10.4.2 Fault injection
10.4.3 Voting errors
10.4.4 RAM errors
10.4.5 Watchdogs
10.4.6 Snake
10.4.7 Solenoid failure
10.4.8 Pressure transducers
10.4.9 Interference testing
DISCUSSION
LIST OF REFERENCES
FIGURES
TABLES
APPENDIX
1. Circuit diagram of power supply
2. Circuit board layout
3.1 Error flags low pass filter calculation
3.2 FPLA programming table
4. Calculation of delay before latching error flags
5. Encoding ROM data
6. Decoding ROM data
7. Operation of the real time clock
8. Design of redundant solenoid driver circuitry
9. List of integrated circuits
10. Governor controller software listings
LIST OF FIGURES
Figure
1 Cross section of a Donkin FIG280 regulator
2 Governor and distribution network
3 Twin stream governor
4 The bath-tub curve
5 TMR voter and error detection in open collector TTL logic
6 TMR ring structure
7 Graph of acceleration factor versus operating temperature for
a reference temperature of 25°C. Plotted for different
activation energies.
8 Failure rate of a 8085 microprocessor versus case ambient
temperature
9 BS9400 screening procedure
10 Graph of simplex and TMR reliability versus normalised
mission time.
11 Graph of MTTFIF versus ratio A i / A i
12 Fault tree analysis for pressure trip
13 Software recovery using recovery blocks
14 Stepper motor controlled regulator
15 Solenoid controlled regulator
16 Block diagram of non fault-tolerant controller
17 Op-code fetch timing showing synchronisation problem
18 RST/HOLD timing
19 Photograph of Governor Controller
20 Photograph of pneumatic test rig
21 Block diagram of microprocessor board
22 Microprocessor and buffers - CHANNEL 1
23 Microprocessor and buffers - CHANNEL 2
24 Microprocessor and buffers - CHANNEL 3
25 Voting circuitry
26 FPLA error signal showing the effect of filtering
27 Resynchronisation timing monitored by logic analyser
28 Clock, Test, and RS232 interface
29 Resynchronisation circuitry
30 Watchdogs and reset circuitry
31 Block diagram of memory board
32 Part of memory board including RAM and decoders
33 Detail of RAM storage circuitry including test circuitry
34 Instant ROM, EPROM, and PORT(RD) decoder circuit diagram
35 Block diagram of input/output board
36 Input/output board
37 Stepper motor and redundant solenoid drive circuitry
38 Pressure transducer and conditioning circuitry
39 Block diagram of pneumatic test rig
40 Flow chart of module MAIN
41 Flow chart of module RESYNC
42 Flow chart of module MERROR
43 Flow chart of module MERROR (continued)
44 Flow chart of module CNTRL
45 Flow chart of module PRESSR
46 Graph of governor outlet pressure versus orifice plate
differential
47 Error messages
LIST OF TABLES
Table
1 Failure mechanisms and activation energies in MOS
semiconductors
2 A comparison of resistor failure rates
3 A comparison of capacitor failure rates
4 A comparison of soldered joints failure rates
5 A comparison of wire-wrap joint failure rates
6 A comparison of edge connector failure rates
7 Activation energies used in failure rate prediction
8 A comparison of TTL integrated circuit failure rates
9 A comparison of the failure rates for a 6800 microprocessor
10 A comparison of 6800 microprocessor adjusted failure rates
11 A comparison of the failure rates for a 8080 microprocessor
12 A comparison of 8080 microprocessor adjusted failure rates
13 A comparison of the failure rates of a 2716 EPROM
14 A comparison of the failure rates of a 1k Bipolar ROM
15 A comparison of the failure rates of a 16k Dynamic RAM
16 A comparison of the failure rates of a 1k Static RAM
17 The relation between cost, screening level, and Quality factor
18 Recommended maximum junction temperatures for semiconductors
19 Signals carried by back-plane bus
20 Correction bits for SEC/DED Hamming code
21 Error position as indicated by error flags
List Of Symbols
ASCII American Standard Code for Information Interchange
CMOS Complementary Metal Oxide Semiconductor
DAG Demand Activated Governing
DBE Double Bit Error
dil dual in line
ECL Emitter coupled logic
EEC European Economic Community
EMP Electromagnetic Pulse
EPROM Erasable Programmable Read Only Memory
FMECA Fault Mode and Effect Criticality Analysis
FPLA Field Programmable Logic Array
FTA Fault Tree Analysis
f/M hrs failures per million hours
ICE In-circuit emulation
λ failure rate
LSI Large Scale Integration
LSTTL Low Power Schottky Transistor Transistor Logic
MOS Metal Oxide Semiconductor
MTBF Mean Time Between Failures
MTFF Mean Time to First Failure
MTTF Mean Time To Failure
MTTFIF Mean Time To Failure Improvement Factor
MTTR Mean Time To Repair
NASA National Aeronautics and Space Administration
POR Power On Reset
PROM Programmable Read Only Memory
psi pounds per square inch
RAM Random Access Memory
rh relative humidity
ROM Read Only Memory
RTC Real Time Clock
SEC/DED Single Bit Error Correction / Double Bit Error Detection
SSI Small Scale Integration
STTL Schottky Transistor Transistor Logic
THB Temperature Humidity Bias
TMR Triple Modular Redundancy
TTL Transistor Transistor Logic
UART Universal Asynchronous Receiver Transmitter
VDU Visual Display Unit
VLSI Very Large Scale Integration
"wg inches water gauge
CHAPTER 1
THE NEED FOR A RELIABLE CONTROLLER
1.1 NATIONAL GAS NETWORK
In the days before the extraction of natural gas, gas was produced,
stored, transmitted, and used locally. Gas was transmitted at low pressure
and there was no national network, which made the control problem much
easier.

The changeover to natural gas involved the installation of a national
network. Gas comes ashore from several gas fields at high pressure and is
transmitted throughout the country at high pressure, typically at 1000 psi
in four foot diameter pipes. Compressor stations are used at regular
intervals to overcome pressure drops in the system. Because of the
extremely large volume of gas being controlled at high pressure and the
absolute need for reliable control, considerable expenditure on fault
tolerant mainframe and mini-computers for control purposes can be
justified.

Gas is progressively reduced in pressure as it is transmitted to the
consumer. Transmission pressures range from 1000 psi to 100 psi and control
of gas at the bottom end of this range is performed by pneumatics. In
order to make full use of the gas network, the technique of line-pack has
been developed, where the pipework itself is used to store large quantities
of gas in order to smooth out variations in demand. To optimise the
efficiency of techniques such as line-pack, it is necessary to replace
pneumatic controllers with electronic digital controllers. These
controllers must be extremely reliable because of the requirement for safe
operation of the network, and failure to supply gas could involve British
Gas in litigation resulting in the payment of considerable compensation.
Fault tolerant controllers can achieve the levels of reliability required
and many of the techniques described and demonstrated in the following
chapters could be used. Although the cost of the controller must be
considered more carefully at low pressures, sufficient savings can be made
by the more efficient operation of the network to justify such controllers.

Gas finally reaches the "distribution" network at relatively low
pressures where there are several thousand governor stations. The quantity
of gas handled by each governor is much less than at higher pressures and
all control is performed in pneumatics. A large governor station might
feed two thousand consumers. The cost of electronic control at this level
is critical; however, the controller must still be very reliable, which
requires fault tolerant techniques to be used. It is this type of "small
digital controller" that has been studied in depth at Durham and a
prototype controller developed. The controller developed is more powerful
than that required for simple pressure/flow control of a single regulator
and could be expanded to control line-pack as described above.

As well as experimenting with hybrid electronic/mechanical governors,
British Gas are making increasing use of microprocessor equipment. Because
of the requirement for high reliability when controlling gas, it is
essential to consider the reliability of every piece of equipment and
introduce fault tolerant techniques if necessary.
1.2 PNEUMATIC GOVERNORS
Since the aim of this research was to develop a "small digital
controller" to be used as part of an electromechanical governor, it is
necessary to study the characteristics of pneumatic governors so that the
replacement is no worse than its predecessor in performance and cost.

The terms regulator and governor are virtually synonymous, but the
term governor will be used to describe a system having an inlet and a
controlled outlet. The term regulator will be used to describe the
individual regulator valves which comprise the system.

Pneumatic regulators have been used by British Gas for many years.
They have many advantages, of which the major ones are listed below:
(i) Tried and trusted - For a mechanical device they are very
reliable. This is partly due to their mature technology and the
vast quantity produced.
(ii) Inexpensive - A typical regulator costs a few hundred pounds.
(iii) They are self powered and need no mains supply.
(iv) They can withstand some unusual operating conditions and
transient surges.
(v) Mechanically they are very rugged and will operate over a wide
temperature range as long as they are prevented from freezing up.

A typical regulator valve is shown in figure (1). The valve which
controls the flow of gas is mounted within the lower cast iron body. The
valve orifice is made from stainless steel to minimise wear and corrosion
and the valve seat is made from nitrile rubber to ensure low leakage when
shut. The valve is attached to a long stem which passes up into the
diaphragm chamber and is attached to the centre of the diaphragm. Movement
of the diaphragm causes the valve to open and shut. The diaphragm
equilibrium position is reached by balancing the force of the spring on the
top of the diaphragm with the gas pressure below. Downstream gas pressure
is fed to the pressure tapping in the bottom diaphragm bowl which is shown
to the right of figure (1). If the downstream gas pressure rises, this
causes the diaphragm to move up which closes the valve and reduces the
downstream pressure. The regulator is thus seen to behave as a negative
feedback closed loop controller of pressure. The outlet pressure of the
regulator is set by the main spring whose compression can be varied by
adjusting the top screwed plug. The valve shown is an "open at rest" type
and rupturing of the diaphragm will cause the valve to fail fully open.
1.3 LIMITATIONS OF PNEUMATIC GOVERNORS
A schematic diagram of a governor feeding a distribution network is
shown in figure (2). There will be some point in the network where the
largest pressure drop exists between that point and the governor. This
point is labelled the worst case pressure point. To complicate matters
further, this point moves around as the loading on the network changes.
There is a statutory minimum pressure (about 5 inches wg) at which
consumers must be supplied, which means that the governor outlet pressure
must always be high enough to give this minimum pressure at the worst case
pressure point. All other consumers will receive gas at pressures above
the statutory minimum.
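
The constraint described above can be illustrated with a short sketch. Only
the figure of about 5 inches wg comes from the text; the supply point names,
the pressure drops and the function name are hypothetical values invented
for illustration.

```python
# Illustrative sketch only: supply point names and pressure drops are invented.
STATUTORY_MIN_WG = 5.0  # inches wg, the approximate statutory minimum


def required_outlet_pressure(drops_wg):
    """Return (governor outlet pressure, worst case pressure point).

    drops_wg maps each supply point to the pressure drop, in inches wg,
    between the governor outlet and that point under the current loading.
    """
    worst_point = max(drops_wg, key=drops_wg.get)
    return STATUTORY_MIN_WG + drops_wg[worst_point], worst_point


# Hypothetical loadings: the worst case point moves as demand changes.
high_demand = {"A": 3.0, "B": 7.0, "C": 5.5}
low_demand = {"A": 1.0, "B": 0.5, "C": 0.8}
print(required_outlet_pressure(high_demand))  # (12.0, 'B')
print(required_outlet_pressure(low_demand))   # (6.0, 'A')
```

With a fixed outlet setting chosen for the high demand case, every point in
the low demand case would be supplied well above the statutory minimum.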
It is desirable to keep the outlet pressure of the governor as low as
possible whilst still satisfying the above conditions because of the
problem of leakage. Old distribution pipework is subject to [...]
leakage which is proportional to the network over-pressure. Under
non worst case conditions and periods of low demand, there will be
considerable over-pressure unless the governor is readjusted. It has been
estimated by Murphy [23] that considerable savings [...] could
be made for a 2"wg reduction in nationwide system pressure.
Using microprocessor techniques, it should be possible to
reduce network over-pressure by at least 2"wg.
As well as resistive pressure drops in the network there is a variable
pressure drop across a simple pneumatic regulator. The outlet pressure of
a typical regulator drops by 15 percent as the flow through it is varied
from zero to full rated - this is called "droop". Consequently the
regulator set point must be set higher than required to compensate for
droop at high flows. A microprocessor controlled valve could reduce the
droop to zero.
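
The effect of droop on the set point can be shown with a small worked
example. The 15 percent figure is from the text; the assumption that the
outlet pressure falls linearly with flow, the 12"wg requirement used in the
example, and the function names are illustrative only.

```python
DROOP = 0.15  # fractional fall in outlet pressure from zero to full rated flow


def outlet_pressure(set_point_wg, flow_fraction, droop=DROOP):
    """Outlet pressure, assuming (for illustration) droop is linear in flow."""
    return set_point_wg * (1.0 - droop * flow_fraction)


def set_point_for(required_wg, droop=DROOP):
    """Set point needed so that the outlet still meets required_wg at full flow."""
    return required_wg / (1.0 - droop)


sp = set_point_for(12.0)
print(round(sp, 2))                               # 14.12 inches wg
print(round(outlet_pressure(sp, 0.0) - 12.0, 2))  # 2.12 inches wg excess at zero flow
```

Under these assumptions a regulator that must deliver 12"wg at full flow
runs about 2"wg above requirement when the flow is small.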
It is also necessary to set the governor outlet pressure higher than that
required in a "twin stream" governor. In large governor stations where it
is imperative to supply gas without failure, it is common to use a twin
stream regulator configuration as shown in figure (3). Slamshuts are
similar to regulator valves, but are designed to close rapidly, cutting off
the gas flow, when their set pressure is reached. Typical pressure
settings are shown in figure (3). The minimum acceptable pressure at the
governor outlet is 12"wg, so the standby regulator is set to give this.
Normally the standby regulator is held off and the outlet pressure is
controlled at 14"wg by the active stream regulator. If the outlet pressure
rises above 18"wg, the active stream is cut off by the slamshut set at
18"wg and the standby stream takes over. If the pressure rises further to
20"wg, the standby stream is cut off by the slamshut set to 20"wg and gas
flow through the governor is cut off. It is difficult to set accurately
both regulators and slamshuts, so it is necessary to have their settings
several inches wg apart so that they do not interact with each other. If
all the valves and slamshuts were controlled by a microprocessor, then it
would be possible to have their set points closer together.
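
The sequence of events in a twin stream governor can be sketched as a small
state model. The four pressure settings are those quoted above; the class
and method names are hypothetical, and the assumption that a tripped
slamshut stays shut until it is reset by hand is made for illustration and
is not stated in the text.

```python
ACTIVE_SET_WG = 14.0        # active stream regulator set point
STANDBY_SET_WG = 12.0       # standby stream regulator set point
ACTIVE_SLAMSHUT_WG = 18.0   # slamshut protecting the active stream
STANDBY_SLAMSHUT_WG = 20.0  # slamshut protecting the standby stream


class TwinStreamGovernor:
    """Tracks which slamshuts have tripped (assumed to latch once tripped)."""

    def __init__(self):
        self.active_shut = False
        self.standby_shut = False

    def update(self, outlet_wg):
        """Apply one outlet pressure reading and return the supplying stream."""
        if outlet_wg >= ACTIVE_SLAMSHUT_WG:
            self.active_shut = True
        if outlet_wg >= STANDBY_SLAMSHUT_WG:
            self.standby_shut = True
        if not self.active_shut:
            return "active"
        if not self.standby_shut:
            return "standby"
        return "shut off"


gov = TwinStreamGovernor()
for reading in (14.0, 18.5, 13.0, 20.0):
    print(reading, gov.update(reading))
# 14.0 active, 18.5 standby, 13.0 standby, 20.0 shut off
```

The 2"wg and 4"wg gaps between adjacent settings are the margins that the
text suggests a microprocessor controlled governor could reduce.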
To compensate for resistive pressure drops in the system, it is
desirable to reduce the governor pressure under conditions of low flow and
to increase the pressure under conditions of high flow. Such control is
called "demand activated governing" (DAG). A pneumatic governor has been
developed to perform this control function, but is very complex and
requires many regulator valves and pneumatic components. If a multi-feed
network is fed by several DAG controllers, then the controllers may
interact and make the system unstable. To prevent this it is necessary
for the controllers to communicate with each other so that interaction is
controlled.
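
A demand activated set point of the kind described above might be sketched
as follows. The base pressure, the extra pressure at full flow, the square
law relating pressure drop to flow, and the function name are all
assumptions made for illustration; none of them is taken from the text.

```python
BASE_OUTLET_WG = 8.0  # hypothetical outlet pressure needed at zero flow
MAX_BOOST_WG = 6.0    # hypothetical extra pressure needed at full rated flow


def dag_set_point(flow_fraction):
    """Outlet set point for a flow between 0.0 (none) and 1.0 (full rated).

    The network pressure drop is assumed to rise with the square of the
    flow, so the set point is raised by the same amount to compensate.
    """
    flow_fraction = min(max(flow_fraction, 0.0), 1.0)
    return BASE_OUTLET_WG + MAX_BOOST_WG * flow_fraction ** 2


for flow in (0.0, 0.5, 1.0):
    print(flow, dag_set_point(flow))
# 0.0 8.0, 0.5 9.5, 1.0 14.0
```

In a digital controller a rule of this kind is a few lines of software,
whereas the pneumatic equivalent needs additional valves and components.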
The previous discussion has shown that pneumatic regulators are
satisfactory in their single form, but as soon as several are connected
together to form a governor, penalties must be paid by way of complexity,
increased network leakage and inefficient control of the network. Most of
the disadvantages of pneumatic governors can be overcome by using a
pneumatic/electronic hybrid governor and such tasks as telemetry, health
monitoring and complex control algorithms are made much easier to
implement. It is however essential that the electronic controller is
extremely reliable, especially in circumstances where the controller is
controlling several valves. It is necessary to use fault tolerant
techniques to achieve such a high level of reliability.
CHAPTER 2
A REVIEW OF OTHER FAULT-TOLERANT CONTROLLERS
2.1 SMALL CONTROLLERS
The application of redundancy to small controllers is often more cost
sensitive than its application to large computers. Redundancy can be used
to achieve nonstop processing, fail safe operation, and a reduction in the
mean time to repair (MTTR) by the use of built in diagnostics.
2.1.1 TMR controllers
Platteter [7] describes a TMR design using three 8085 microprocessors
which is similar to the experimental controller described in chapter eight.
Platteter recommends the use of TMR for the protection against untestable
errors in microprocessors. Three different manufacturers' 8085s are used,
each having a different independent design. He proposes that it is better
to accept that complex VLSI circuits such as microprocessors will contain
untestable faults, and to protect against this using TMR techniques, than
to try to eliminate all manufacturing faults. The voting is performed at
bus level in STTL integrated circuits which will be less reliable than
voting in FPLAs. A very unsatisfactory approach is taken towards
resynchronisation. Either resynchronisation "just happens" or it is aided
by PUSHing all registers onto the stack and then POPping them off. He
reports that sometimes the system would not resynchronise and it was
necessary to perform a reset. The reasons for the system not synchronising
and how this can be overcome are described in chapter eight.
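Bus-level voting in a TMR system amounts to taking, for every bus line, the value on which at least two of the three channels agree. The following is a minimal software sketch of that two-out-of-three logic; the function names and the 8-bit example words are assumptions for illustration and do not describe any of the voter circuits discussed here.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority of three bus words."""
    return (a & b) | (a & c) | (b & c)


def disagreeing_channels(a: int, b: int, c: int) -> list:
    """Names of the channels whose word differs from the voted word."""
    voted = majority_vote(a, b, c)
    return [name for name, word in (("A", a), ("B", b), ("C", c))
            if word != voted]


if __name__ == "__main__":
    good = 0x3C
    faulty = good ^ 0x08          # one bit corrupted on channel B
    print(hex(majority_vote(good, faulty, good)))      # 0x3c
    print(disagreeing_channels(good, faulty, good))    # ['B']
```

Because the vote is taken bit by bit, errors on different bits of different channels are also masked, provided no single bit is wrong on two channels at once.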
Higuchi et al [22] describe a TMR system using three 8085
microprocessors which is resynchronised at regular intervals. This is
unnecessary since it is better only to resynchronise when a voting error is
detected. Resynchronisation is performed by PUSHing all registers onto the
stack and then POPping them off. This will not always resynchronise the
program counter and sometimes one processor will be found to lock one clock
cycle behind the other two as described in chapter eight. These
difficulties are not reported. An alternative TMR configuration is
proposed by Higuchi which uses software voting: the three processors
execute tasks at staggered intervals and compare their results when all
processors have executed the task. The software will be more complex and
voting will not be transparent to the user, but the execution of tasks at
different times should reduce the effect of transient errors.
Ryland [20] describes the use of microprocessors in reliable railway
signalling equipment. The interlocking system uses three 6800
microprocessors in a TMR configuration. The processors are loosely
synchronised and a processor will attempt to shut down any processor with
which it disagrees (assassination). If the assassination attempt fails, it
will shut itself down (suicide). Fault reporting and the repair of a
faulty channel are performed on-line. Two separate data links are provided
to the equipment and data is transmitted at 10k baud in Manchester code
with Hamming code protection. The software was written in assembler: the
designer felt that reliability was easier to achieve in this way, since
no arithmetic was used and the hardware interface was simple.
Davies et al [21] discuss ring communication structures for a small
system. Three 8748 single chip microprocessors are connected in a TMR ring
structure and the synchronisation is performed in software and is
accomplished by handshaking between the processors or the insertion of a
fixed delay before voting. The outputs from the three microprocessors must
be combined which is nearly as complex as voting in hardware, but they
propose that software voting offers greater versatility and is not prone to
component failure.
2.1.2 Fail safe controller
A life support system controller is described by Lim [4]. The design
is deliberately not fault-tolerant because of the increased cost, but is
designed to detect hardware and software faults and to cause the system to
fail safe. An alarm is given on failure of the controller so that human
operation will ensure continued operation of the life support system.
2.1.3 Built in test equipment
Foose [17] describes an industrial microprocessor controller which was
designed to reduce the MTTR. The design aim was 80% automatic testability
for 10% extra cost. The controller is divided into separate modules, each
performing a single function, and modules are designed to test themselves.
By reducing the MTTR the availability of the controller is increased and
maintenance costs reduced. Failure diagnosis can also be performed by
unskilled operators.
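The link between MTTR and availability can be made concrete with the usual steady-state relation, availability = MTTF / (MTTF + MTTR). The sketch below assumes that relation; the function name and the figures in the example are hypothetical and are not taken from Foose [17].

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the unit is working."""
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical controller: the same failure rate with two repair times.
    slow_repair = availability(mttf_hours=10_000.0, mttr_hours=8.0)
    fast_repair = availability(mttf_hours=10_000.0, mttr_hours=4.0)
    print(f"MTTR 8 h: {slow_repair:.5f}")
    print(f"MTTR 4 h: {fast_repair:.5f}")
```

Halving the repair time roughly halves the unavailability without any change to the failure rate of the hardware.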
2.2 LARGE COMPUTERS
The majority of research so far has been concentrated on large
fault-tolerant computers which are used in the space, avionics, and nuclear
industries.
2.2.1 Space shuttle and avionics
The space shuttle is probably one of the best (and most expensive)
examples of a fault-tolerant system. References [12,13] describe the
digital processing subsystem. NASA decided to use standard proven avionics
computers which nowadays are rather outdated and to implement the
redundancy in software. Five identical IBM avionics 32 bit computers are
used each having 250k bytes of memory. The computers share two 16M byte
tape drives for mass storage. The computer contains built in test
equipment and can detect 98% of errors. The computers are interconnected
to each other and to the sensors and actuators by serial buses. The
highest reliability configuration consists of four computers in loose
synchronism with the fifth computer executing background tasks.
Synchronism is achieved by hardware and software and two computers out of
the four can fail without a catastrophic system failure. The inertial
guidance platform is triplicated and connected to different buses. The
inertial information undergoes selection filtering which compares the
information channels and rejects any that are above a predetermined trip
level. With all channels functional, mid-value selection is used which is
common for inertial systems and failure of channels results in a degraded
selection algorithm. Failure of a channel must not cause a large transient
disturbance at the output of the selection filter, since a fault lasting
only one second is enough under certain conditions to crash the shuttle.
The cycle time of the control process is 40ms in which time selection
filtering and control of all the aerodynamic surfaces and rockets is
performed. The cycle time of 40ms is necessary for the stability of the
shuttle.
Ahern et al [14] describe a software voter/monitor for selection
filtering of inertial guidance information. It is necessary to reject
faulty channels and then to perform the vote. Mid-value selection is again
chosen as the best algorithm and the requirement to smooth the switch-over
from one selection algorithm to another is stressed to avoid transient
disturbances.
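Selection filtering of redundant sensor channels can be sketched as two steps: reject channels that deviate by more than a trip level, then select from the survivors. In the sketch below the deviation is measured from the median of all channels, and the degraded algorithm (averaging when fewer than three channels survive) is an assumption chosen for illustration; neither detail is taken from references [12,13,14], and the function name is hypothetical.

```python
from statistics import median


def select_signal(channels, trip_level):
    """Reject outlying channels, then apply mid-value selection.

    A channel is rejected when it differs from the median of all channels
    by more than trip_level. With three or more survivors the mid-value
    (median) is returned; with fewer, the survivors are averaged
    (assumed degraded mode).
    """
    reference = median(channels)
    healthy = [x for x in channels if abs(x - reference) <= trip_level]
    if len(healthy) >= 3:
        return median(healthy)
    return sum(healthy) / len(healthy)


if __name__ == "__main__":
    print(select_signal([100.0, 102.0, 101.0], trip_level=5.0))  # 101.0
    print(select_signal([100.0, 102.0, 550.0], trip_level=5.0))  # 101.0
```

Mid-value selection never outputs the reading of a single failed channel while three channels are in use, which is one reason it limits transients when a channel fails.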
Modern military aircraft are controlled by the fly-by-wire technique.
A digital flight computer reads in inertial and pilot information and
controls electromechanical actuators which are connected to the control
surfaces. Mechanical and hydraulic linkages are replaced, but it is
necessary to use redundancy techniques to achieve high reliability in the
controller. An added advantage in military aircraft is the increased
survivability of the plane if it is damaged. For obvious reasons nothing
has been published about modern use of microprocessors in digital flight
controllers, but it is known that several different types of microprocessor
including the Texas 9900 and Motorola 6800 are used in redundant
configurations in the Tornado. Deets et al [15], in discussing the design
and flight experience of a fly-by-wire control system, describe the first
aircraft to use the fly-by-wire technique. An Apollo space computer was
installed in a modified F8 fighter. The 16 bit computer had a 36k word
memory and a 12µs instruction cycle, and performed the control algorithm
in 30ms. The initial tests were successful and were conducted with a
parallel back up hydraulic system.
Black et al [5] discuss the development of a spaceborne memory, which
uses a memory error correction code, having the same number of bits as the
SEC/DED Hamming code, but which is able to correct double bit errors (DBE).
The position of stuck single bit errors is logged so that if a double bit
error occurs, the stuck single bit error can be "erased" since its position
is known, and a DBE tolerated. In order to achieve the high reliability
required in space applications using semiconductor random access memory
(RAM) the design was required to tolerate double bit errors.
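For comparison, the behaviour of an ordinary SEC/DED Hamming code (single error correction, double error detection) can be shown on a 4-bit data word. This is a minimal sketch of the standard extended Hamming (8,4) code, not of the erasure-based scheme of Black et al [5]; the bit layout and function names are choices made for this illustration.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits as an 8-bit SEC/DED code word (list of bits).

    Index 0 holds the overall parity bit; indices 1..7 are the Hamming
    positions, with check bits at 1, 2, 4 and data bits at 3, 5, 6, 7.
    """
    code = [0] * 8
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def decode(received):
    """Return (nibble, status); nibble is None for a double bit error."""
    code = list(received)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position      # XOR of the positions of all set bits
    overall = sum(code) % 2           # 1 means an odd number of bit errors
    if overall == 1:
        code[syndrome] ^= 1           # syndrome 0 points at the parity bit
        status = "corrected"
    elif syndrome != 0:
        return None, "double error"   # detected but not correctable
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                      # inject a single bit error
    print(decode(word))               # (11, 'corrected')
```

A second error in the same word is only detected, which is the limitation that logging the position of stuck bits is intended to overcome.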
2.2.2 Commercial computers
Commercial computers are now being built with redundant circuitry,
especially for memory protection. The aim is to increase the mean time to
failure (MTTF) by redundant circuitry and to reduce the MTTR by built in
test equipment. Hence the availability can be increased and maintenance
costs reduced. Toschi et al [6], in discussing a fault-tolerant computer
memory, describe the advantages gained by adding single bit error
correction to memory and the tolerance of transient errors. Troublesome
and failed memory devices are masked by redundancy and are replaced at
periodic maintenance intervals. Swarz [18] describes the design
methodology behind the VAX computer with special emphasis on the ability of
the computer to tolerate memory, disk, and other errors. The computer is
designed to minimise the MTTR by the use of a LSI 11 microprocessor
dedicated to diagnosing faults in the VAX computer. In this way the
availability is improved and the maintenance costs reduced.
2.2.3 Telephone exchange computers
Modern PCM telephone exchange computers must have a very high
availability, typically less than two hours down-time in forty years.
Fantini et al [2] describe an exchange computer based on the Z8000 16 bit
microprocessor which has a fault coverage of 0.98. A duplicate processor
is used to detect errors by comparison with the main processor. The power
supply, clock, and bus are continuously monitored for errors and RAM and
ROM checking is performed off-line. The aim of the design is to detect and
isolate faults. Most faults were found to be transient, but the system
relies on the prompt repair of permanent faults to achieve a high
availability.
Ceru et al [10] describe a similar exchange computer. The 16 bit
processor is constructed from discrete TTL and LSTTL and consists of two
processors operating in synchronism. The detection of an error causes both
processors to check themselves and each other, and the first one to finish
the checking resumes control of the exchange. Faults are logged and the
second processor is resynchronised if possible.
2.2.4 Safety monitoring computer
Harbert [3] describes a system using three 8086 16 bit microprocessors
in a fire/gas detection and automatic shut-down controller. The input
boards, microprocessor, memory, and output boards are triplicated and the
processors operate asynchronously. Magnetic bubble storage is used since
this gives improved reliability over spinning memory devices. Several
hundred detectors are monitored by the system and their status displayed on
colour VDUs, which give a clear display of information. The system controls
emergency shut-down and extinguishing equipment.
2.2.5 PDP 11 computer
Canepa et al [19], in discussing the architecture of multiprocessing
systems, describe a system consisting of three LSI 11 computers connected
in a versatile triplex configuration. Hardware modification is minimal
except for the construction of the voters which use 250 SSI integrated
circuits. Voting is performed at bus level which makes the redundancy
transparent to the software, allowing standard software to be executed.
The processors can be configured to operate singly, with one processor
talking to the other two, or in a TMR configuration where the processors
run in synchronism. The system has been built to examine the effect of
transient faults on computers. By monitoring the three processor system
and injecting faults into one channel, they hope to gain information about
the frequency, duration and location of transient faults in a computer.
2.2.6 Railway applications
Forsythe et al [16], in discussing reliable train control
applications, describe a system using three INS8900 16 bit microprocessors
which share a common bus. Each processor has its own memory store and both
processors and memory are buffered to improve the fault isolation. The
three processors share common RAM and EPROM and voting is performed in
software to reduce the hardware costs. The operation of the controller is
divided into four sectors. In the first three sectors, each processor
executes a task and then swaps tasks with another processor at the end of
the sector. In this way the three tasks are executed three times on three
different microprocessors. In the fourth sector the results of the
computations are compared and recovery is executed if required.
Input/output is performed in the fourth sector only if the error checking
is satisfactory. In spite of the high level of redundancy, the system was
found to lock up as a result of certain interference tests and the watchdog
timer was found to reset the system and restore correct operation.
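The four-sector cycle can be sketched as a rotation of tasks over processors followed by a comparison step. The cyclic rotation order, the modelling of a processor as a callable, and all names below are assumptions made for illustration; Forsythe et al [16] may schedule the swaps differently.

```python
from collections import Counter


def rotation_schedule(n: int = 3):
    """schedule[sector][processor] gives the index of the task run there."""
    return [[(processor + sector) % n for processor in range(n)]
            for sector in range(n)]


def compare_results(results_per_task):
    """Fourth sector: majority-vote the results collected for each task.

    Returns the voted value per task, or None where no two results agree.
    """
    voted = []
    for results in results_per_task:
        value, count = Counter(results).most_common(1)[0]
        voted.append(value if count >= 2 else None)
    return voted


def run_cycle(tasks, processors, data):
    """Run every task once on every processor, then compare the results."""
    results = [[] for _ in tasks]
    for sector in rotation_schedule(len(tasks)):
        for processor, task_index in enumerate(sector):
            results[task_index].append(
                processors[processor](tasks[task_index], data))
    return compare_results(results)


if __name__ == "__main__":
    tasks = [lambda x: x * 2, lambda x: x + 3, lambda x: x * x]
    healthy = lambda task, x: task(x)
    faulty = lambda task, x: task(x) + 1      # hypothetical corrupting CPU
    print(run_cycle(tasks, [healthy, faulty, healthy], 5))   # [10, 8, 25]
```

Because every task visits every processor, a single faulty processor contributes exactly one wrong result per task and is outvoted in the comparison sector.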
2.2.7 Fault-tolerant software
Much of the fault-tolerance on large computers is implemented in
software. References [8,9,11] describe the implementation of "recovery
blocks" on a large machine in a high level language. A recovery block is
defined as a section of code in which recovery is possible. Calculations
undergo a series of acceptance tests. If the acceptance test is not passed
an alternative calculation is tried. Before executing an acceptance test,
variables are stored in a "recovery cache" so that the variables can be
restored to their initial state if the alternative calculation and
acceptance test fails. Ghani et al [11] describe the "recovery cache"
hardware as implemented on a PDP 11 computer.
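The recovery block idea can be sketched in a few lines. In this illustration the "recovery cache" is simply a deep copy of a dictionary of variables taken on entry to the block, and the names `recovery_block`, `alternates` and `acceptance_test` are assumptions; the implementations in references [8,9,11] differ in detail.

```python
import copy
import math


class RecoveryBlockFailure(Exception):
    """Raised when every alternate has failed its acceptance test."""


def recovery_block(state, alternates, acceptance_test):
    """Try each alternate in turn until one passes the acceptance test.

    state is a dict of variables that the alternates modify in place.
    Before each retry the variables are restored from the checkpoint.
    """
    checkpoint = copy.deepcopy(state)          # plays the "recovery cache"
    for alternate in alternates:
        try:
            alternate(state)
            if acceptance_test(state):
                return state
        except Exception:
            pass                               # treat a crash as a failure
        state.clear()
        state.update(copy.deepcopy(checkpoint))
    raise RecoveryBlockFailure("all alternates failed")


if __name__ == "__main__":
    def primary(s):                            # deliberately faulty routine
        s["x"] = s["x"] + 1.0
        s["root"] = 0.0

    def secondary(s):
        s["root"] = math.sqrt(s["x"])

    def accept(s):
        return abs(s["root"] ** 2 - s["x"]) < 1e-9

    print(recovery_block({"x": 2.0}, [primary, secondary], accept))
```

The acceptance test only checks that the result is plausible; it does not need to know which alternate produced it.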
2.3 SUMMARY
Many of the fault-tolerant features reported in small controllers are
used in the experimental controller. The designs of Platteter [7] and
Higuchi et al [22] are similar to the governor controller which overcomes
many of the shortcomings in their designs, such as the failure to
resynchronise. Ryland [20] proposes a different TMR structure and reports
that software was written in assembler. The governor controller software
was likewise written in assembler. Davies et al [21] discuss an
alternative TMR structure where the voting is performed in software. This
structure is very suitable for single chip microprocessors, but was not
used in the governor controller for the reasons given in chapters six and
seven. The controller described by Lim [4] is designed to fail safe as is
the governor controller. Foose [17] reports that built in testability can
be incorporated at little extra cost and Ryland [20] mentions the on-line
reporting of faults. The governor controller performs on-line fault
reporting which allows the availability to be increased and maintenance
costs reduced.
Although large computers are outside the definition of small
controllers, a discussion of fault-tolerant features is included for the
sake of completeness and many of their fault-tolerant features can be
adapted for use on small microprocessor controllers. Much of their
fault-tolerance is implemented in software and is more complicated than
that required for a small controller. The space shuttle uses serial buses
for the interconnection of units since this is more reliable than using
large parallel buses. The space shuttle and fly-by-wire aircraft use
redundant inertial sensors and software is used to select the best inertial
information. Mid-value selection is commonly used as it is a fast
algorithm to implement. The governor controller, in a similar fashion,
uses redundant pressure transducers and selects the best pressure
information using software. Time is not critical, so a more suitable
averaging algorithm is used, rather than the faster mid-value selection.
References [5,6,18] describe the use of single bit error correction in
semiconductor memory. Black et al [5] discuss a code which will correct
double bit errors, but which uses the same number of bits as the more
normal Hamming code. The experimental controller cannot correct double bit
errors, but allows recovery from them, as well as transparently correcting
single bit errors.
References [2,10] describe similar telephone exchange computers
consisting of a main processor and a standby spare. This architecture
could have been used in the governor controller, but a TMR structure was
used in preference. Fantini et al [2] report that most faults experienced
were of a transient nature. References [3,16] describe 16 bit TMR
controllers where the voting is performed in software. The governor
controller performs the voting in hardware which is transparent to the
software. Canepa et al [19] describe a TMR system, using three PDP 11
computers, where the voting is likewise performed in hardware. The
complexity of hardware voting in large computers is highlighted by the size
of the required voting circuitry.
Recovery blocks, as described in references [8,9,11], can usefully be
implemented on small controllers. The governor controller uses a special
type of recovery block, more properly called a recovery vector, which
allows vectored recovery to be executed following the detection of a
hardware or software error.
CHAPTER 3 
FAILURE OF COMPONENTS 
3.1 FAILURE DISTRIBUTION AND MECHANISMS 
The failure rate of most types of electrical devices follows the
classical "bath-tub" curve of figure(4). Phase one, infant mortality,
represents the early life failures of a device and is usually associated 
with one or more manufacturing defects. After several hundred hours, the 
failure rate approaches some constant low value, phase two, where it
remains for anything from several years to several hundred years and 
failures occur randomly. Wearout failures, phase three, occur at the end 
of the useful life of a device and are characterised by a rapidly rising 
failure rate with time as the device wears out both physically and 
electrically. A common wearout mechanism in integrated circuits is
corrosion due to moisture trapped inside the device package. Under normal 
operating conditions, failure due to wearout is rarely experienced with 
integrated circuits, unless they are operated for a very long period of 
time (references [34,27]).
The constant failure rate region, phase two, represents failures due 
to random events such as electrical surges. Most failure rate data sources 
assume a constant failure rate and present their results in the form of n 
failures per unit time, typically n failures/1o' hours. This is
satisfactory as long as the wearout region is not encountered during the 
lifetime of the equipment. This assumption is likely to hold for
integrated circuits and resistors, but components such as electrolytic
capacitors may be different. The wearout phase of an electrolytic
capacitor results from the electrolyte drying up, which might occur after
only a few years. However the quoted "constant" failure rate might suggest
a MTTF of several hundred years. Thus it is essential to distinguish
between the useful life of a component and MTTF if wearout is encountered.
An example of the misleading result that this confusion might produce
can be given by considering the human life span. The mortality rate for 
humans approximately follows the "bath-tub" curve, but if only the constant 
"failure rate" experienced during youth and middle-age is considered, then 
a MTTF of two thousand years is predicted as opposed to a normal lifespan 
of about seventy five years. 
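The arithmetic behind this example follows from the constant failure rate model, in which MTTF is the reciprocal of the failure rate. The rate of 0.0005 deaths per person per year used below is simply the value implied by the quoted MTTF of two thousand years; it is an illustrative figure, not a demographic statistic, and the function names are hypothetical.

```python
import math


def mttf(rate_per_year: float) -> float:
    """Mean time to failure for a constant failure rate (exponential model)."""
    return 1.0 / rate_per_year


def survival_probability(rate_per_year: float, years: float) -> float:
    """Probability of surviving `years` if the rate really stayed constant."""
    return math.exp(-rate_per_year * years)


if __name__ == "__main__":
    rate = 1.0 / 2000.0                    # implied by an MTTF of 2000 years
    print(mttf(rate))
    print(round(survival_probability(rate, 75.0), 3))
```

Under the constant-rate model well over nine in ten people would still be alive at seventy five, which shows how badly the model fails once wearout begins.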
The majority of components used in "small digital controllers" are 
integrated circuits, resistors, and small decoupling capacitors. It is 
therefore valid under favourable operating conditions to consider the 
failure rate of such components constant and to assume that their useful 
life is equivalent to the MTTF. 
Failure mechanisms in MOS integrated circuits are given in table(1) 
and are discussed in more detail in references[27,35]. Most potential 
failures caused by these mechanisms can be detected early by suitable 
screening. Consideration of these mechanisms is important since the 
majority of microprocessor and memory components are fabricated using the 
MOS technology. 
3.2 FAILURE RATE MEASUREMENT AND ACCELERATION 
The simplest method of calculating the failure rate of devices under 
controlled or field testing is by the equation : 
$$\text{failure rate} = \frac{\text{number of failures}}{\text{no. devices} \times \text{no. hours tested}}$$
The problem with this equation is that statistically it only 
represents a point estimate at 50% confidence, and that if testing reveals 
no failures it gives a failure rate of zero, which is obviously not the 
true failure rate. A statistical solution to this problem is given by the 
Chi-squared distribution as discussed in references[32,49]. 
$$\text{failure rate} = \frac{\chi^2(1 - CL,\; 2r + 2)}{2nt} \qquad (3.2.1)$$
where: $\chi^2$ = Chi-square function 
CL = confidence level expressed as a decimal 
r = number of failures 
n = number of parts tested 
t = total test duration 
Tables of $\chi^2$ are found in many texts on statistics[49]. 
Equation(3.2.1) is almost universally used, with most failure rates 
quoted at the 60% confidence level. 
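A minimal sketch of this calculation is given below, assuming the usual form in which the Chi-square value is taken with 2r+2 degrees of freedom at the chosen confidence level. To keep the example free of external libraries, the Chi-square value is found from the equivalent Poisson sum by bisection rather than from tables; the function names and the test figures (1000 devices for 1000 hours) are illustrative assumptions.

```python
import math


def poisson_cdf(r, mean):
    """P(X <= r) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for i in range(1, r + 1):
        term *= mean / i
        total += term
    return total


def chi2_value(confidence, failures):
    """Chi-square value with 2*failures + 2 degrees of freedom.

    Uses the identity P(chi2 <= x) = 1 - PoissonCDF(failures; x / 2),
    which holds for an even number of degrees of freedom, and solves
    for x by bisection.
    """
    lo, hi = 0.0, 1.0
    while 1.0 - poisson_cdf(failures, hi / 2.0) < confidence:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - poisson_cdf(failures, mid / 2.0) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def failure_rate_bound(failures, parts, hours, confidence=0.6):
    """Upper bound on the failure rate, in failures per device-hour."""
    return chi2_value(confidence, failures) / (2.0 * parts * hours)


# Hypothetical test: 1000 devices for 1000 hours with no failures.
rate = failure_rate_bound(failures=0, parts=1000, hours=1000.0)
print("failures per 10^6 hours at 60% confidence:", rate * 1e6)
```

With no failures the bound reduces to -ln(1 - CL)/(nt), so a test with zero failures still yields a non-zero failure rate.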
3.2.1 Accelerated testing 
Even "unreliable" microelectronic devices may last several thousand 
hours before failing which makes life-testing a very long process and 
screening virtually impossible. It has been shown that the failure rate of 
microelectronic devices exponentially increases with temperature according 
to the Arrhenius reaction rate equation, references[35,37,38]. 
$$\text{failure rate} = C \exp\left(-\frac{E_a}{KT}\right) \qquad (3.2.2)$$
where: $E_a$ = activation energy in eV 
K = Boltzmann's constant ( 8.63 x 10^-5 eV/K ) 
T = absolute temperature 
C = an appropriate constant 
When conducting accelerated tests and analysing test data it is 
important to remember two things : 
(i) The failure rate is exponentially dependent on temperature so 
that incorrect specification of the device junction temperature 
will have a large effect on the failure rate. 
(ii) The correct activation energy should be chosen, appropriate to 
the failure mechanism under consideration. 
According to the Arrhenius equation it is possible to accelerate 
failures by testing at elevated temperatures and the 
acceleration factor is calculated by : 
$$F_a = \exp\left[\frac{E_a}{K}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)\right] \qquad (3.2.3)$$
where : $F_a$ = acceleration factor 
$T_1$ = test temperature of the junction 
$T_2$ = desired temperature of the junction 
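The acceleration factor can be evaluated numerically as sketched below. This assumes the standard Arrhenius form with both temperatures converted to kelvin and takes Boltzmann's constant as 8.63 x 10^-5 eV/K; the function names, the 125°C test temperature and the three activation energies are illustrative assumptions rather than values taken from the tables.

```python
import math

# Value quoted in the text; the CODATA value is about 8.617e-5 eV/K.
BOLTZMANN_EV = 8.63e-5  # eV/K


def celsius_to_kelvin(t_c):
    return t_c + 273.15


def acceleration_factor(ea_ev, test_c, desired_c, k=BOLTZMANN_EV):
    """Arrhenius acceleration factor between a test and a desired temperature.

    ea_ev     : activation energy in eV
    test_c    : junction temperature during the test, in deg C
    desired_c : junction temperature in normal use, in deg C
    """
    t_test = celsius_to_kelvin(test_c)
    t_desired = celsius_to_kelvin(desired_c)
    return math.exp((ea_ev / k) * (1.0 / t_desired - 1.0 / t_test))


# Sensitivity to activation energy for an assumed 125 deg C test
# referred to 25 deg C.
for ea in (0.3, 0.6, 1.0):
    print(f"Ea = {ea:.1f} eV  ->  Fa = {acceleration_factor(ea, 125.0, 25.0):.1f}")
```

The spread of several orders of magnitude between the lowest and highest activation energy illustrates why the choice of activation energy matters so much.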
The graph of figure(7) shows the relationship between Fa and the test 
temperature, T1, for a desired temperature, T2, of 25°C. A family of 
curves is plotted, showing the sensitivity of acceleration factor to 
activation energy. Activation energies chosen are those corresponding to 
the major failure mechanisms of table(1). 
When considering failures due to moisture ingression into the 
microcircuit package and subsequent corrosion of the metallisation and 
bonding wires, a similar acceleration factor may be used. It is common to 
test under conditions of 85°C/85%rh, references[39,43]. The acceleration 
factor may be calculated according to the Lawson-Harrison law, references 
[35,44] and is given by : 
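$$F_a = \exp\left[\frac{E_a}{K}\left(\frac{1}{T_1} - \frac{1}{T_2}\right) + b\left(H_2^2 - H_1^2\right)\right] \qquad (3.2.4)$$
where : $F_a$ = acceleration factor 
$E_a$ = activation energy 
K = Boltzmann's constant 
$T_1$ = desired junction temperature 
$T_2$ = test junction temperature 
$H_2$ = test humidity 
$H_1$ = desired humidity 
b = a constant 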
Reynolds[35] uses values of Ea = 0.6eV as per table(1) and b = 4.4. 
3.3 FAILURE RATE PREDICTION 
Failure rates may be predicted either by accelerated testing or a 
combination of accelerated testing, field testing, and mathematical 
modelling. The most widely used document for failure rate prediction is 
probably MIL-217 [37], prepared by the American Department of Defense at 
the Rome Air Development Centre (RADC). This document is regularly updated 
and has been published as MIL-217 A, B, C and recently D. The first to 
predict the failure rate of microelectronic devices was MIL-217B. 
Components covered are those mainly used in defence applications, but this 
covers resistors, capacitors, most integrated circuits, connectors, 
switches etc. and field data is mainly gathered from defence applications. 
A mathematical failure rate is developed for each type of component of the 
form : 
$$\lambda = \lambda_b \, \pi_Q \, \pi_E \, \pi_A \qquad (3.3.1)$$
where : $\lambda$ = failure rate 
$\lambda_b$ = base failure rate 
$\pi_Q$ = quality factor 
$\pi_E$ = environmental factor 
$\pi_A$ = application factor - voltage stress, power rating etc. 
The factors $\lambda_b$, $\pi_Q$, $\pi_E$, $\pi_A$ etc. are tabulated in MIL-217 covering many 
operating conditions and grades of component. 
The failure rate of components depends on their quality and the MIL-217 
series attempts to allocate $\pi_Q$, or quality factors, according to the 
grade of manufacture and subsequent screening. 
An environment factor, $\pi_E$, is applied to take account of the 
operating environment of the component. Experience has shown that failure 
rates depend on the operating environment, as might be expected. For 
instance a missile launch is more hostile than an aircraft in flight which 
in turn is more hostile than a ground based environment. For the purpose 
of evaluation of industrial equipment, the environment chosen as being 
appropriate is "ground fixed", Gf. 
Finally the application factor, $\pi_A$, is taken into account. This 
factor takes many forms, but is mainly used to reflect the electrical 
stress under which the component is operating. Any derating of the 
voltage, current, or power handled by the component will result in an 
improved value of $\pi_A$. 
The failure rate model for MOS and bipolar devices is of particular 
interest since this covers TTL logic and most microprocessor and memory 
devices excluding ROMs. The model used is : 
$$\lambda = \pi_Q \pi_L \left[ C_1 \pi_T \pi_V + (C_2 + C_3) \pi_E \right] \ \text{failures}/10^6\ \text{hours} \qquad (3.3.2)$$
where : $\lambda$ = device failure rate 
$\pi_Q$ = quality factor - depends on grade and screening level 
$\pi_L$ = device learning factor - unity for a mature device 
$\pi_T$ = temperature acceleration factor - Arrhenius relation 
$\pi_V$ = voltage stress factor - unity except for CMOS 
$C_1$ = device complexity factor - depends on transistor count 
$C_2$ = device complexity factor - depends on transistor count 
$C_3$ = package complexity factor 
$\pi_E$ = environment factor 
Since $\pi_T$ is exponentially dependent on temperature, then for high 
temperatures, $\pi_T$ is large and the model can be simplified to the 
approximation : 
$$\lambda = \pi_Q \pi_T C_1 \qquad (3.3.3)$$
The failure rate is therefore exponentially dependent on the junction 
temperature of the device. 
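The way in which the temperature term comes to dominate can be illustrated with the sketch below. It assumes the model has the form quality factor x learning factor x [C1 x temperature factor x voltage factor + (C2 + C3) x environment factor], and every numerical factor value is hypothetical, chosen only for illustration and not taken from MIL-217.

```python
def mil217_microcircuit_rate(pi_q, pi_l, pi_t, pi_v, c1, c2, c3, pi_e):
    """Full microcircuit model (assumed form), in failures per 10^6 hours."""
    return pi_q * pi_l * (c1 * pi_t * pi_v + (c2 + c3) * pi_e)


def high_temperature_approximation(pi_q, pi_t, c1):
    """Simplified model, valid when the C1 * pi_T term dominates
    and the learning and voltage factors are unity."""
    return pi_q * pi_t * c1


# Hypothetical factor values (illustration only, not from MIL-217).
FACTORS = dict(pi_q=1.0, pi_l=1.0, pi_v=1.0, c1=0.06, c2=0.003, c3=0.01, pi_e=2.5)

for pi_t in (0.1, 1.0, 10.0, 100.0, 1000.0):
    full = mil217_microcircuit_rate(pi_t=pi_t, **FACTORS)
    approx = high_temperature_approximation(FACTORS["pi_q"], pi_t, FACTORS["c1"])
    print(f"pi_T = {pi_t:7.1f}  full = {full:9.4f}  "
          f"approx = {approx:9.4f}  ratio = {approx / full:.3f}")
```

With these assumed values the approximation is poor at low temperature factors, where the package and environment term dominates, and converges on the full model as the temperature factor grows.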
3.3.1 CNET 
In 1972 the Comité de Coordination des Télécommunications in France 
decided to establish a group of people to evaluate and predict the 
reliability of components used in the telecommunications and computing 
industries. The first version of their report was published in 1976, 
reference[38]. This report was based on MIL-217B (an earlier version of 
MIL-217D [37]), but was biased towards telecommunication and computer 
equipment operating in favourable environments as opposed to defence 
equipment operating in hostile environments. Failure rate models are given 
which are similar to MIL-217. The model given for microcircuits was 
considerably different, but was updated in 1982 to the following model 
which is more similar to the MIL-217 model. 
$$\lambda = \pi_Q \pi_L \left[ C_1 \pi_T \pi_t \pi_V + C_2 \pi_B \pi_E \pi_S \right] \ \text{failures}/10^9\ \text{hours} \qquad (3.3.4)$$
where : $\lambda$ = device failure rate 
$\pi_Q$ = quality factor - depends on grade and screening level 
$\pi_L$ = reliability growth factor - unity for a mature device 
$\pi_t$ = temperature acceleration factor - Arrhenius relation 
$\pi_T$ = device technology factor 
$\pi_V$ = voltage stress factor - unity except for CMOS 
$C_1$ = device complexity factor - depends on transistor count 
$C_2$ = device complexity factor - depends on transistor count 
$\pi_E$ = environment factor 
$\pi_B$ = humidity / temperature factor 
$\pi_S$ = transportation factor - depends on no. of journeys 
3.3.2 National Centre of Systems Reliability 
The NCSR reliability data [34] is a condensed version of their 
computer data bank held by the Systems Reliability Service, SRS. The data 
bank comprises two main parts. The first contains field data gathered from 
the nuclear industry where the environmental conditions are well known and 
controlled. The second part contains information from MIL-217C as well as 
data from other published sources, laboratory tests and theoretical 
predictions. Martin Marietta Aerospace provide much of the support for the 
RADC data which in turn influences the MIL-217 series. The SRS data bank 
contains this data, so the SRS data will not be an independent source to 
MIL-217. 
The failure rate models are similar to MIL-217 with the exception of 
the microcircuit model, which is : 
$$F = K_1 K_g \left( F_{e1} + F_{e2} + F_t \right) \ \text{failures}/10^6\ \text{hours} \qquad (3.3.5)$$
where : F = failure rate for a hermetic device 
$K_1$ = unity for mature devices otherwise ten 
$K_g$ = reliability growth factor 
$F_{e1}$ = transistor count complexity / environment factor 
$F_{e2}$ = packaging factor - depends on no. of pins and environment 
$F_t$ = temperature acceleration factor dependent on packaging 
The model only considers two grades of device, hermetic and non-hermetic. 
The non-hermetic device failure rate is taken as twice the hermetic failure 
rate, with further adjustments incorporated into $F_t$. 
3.3.3 RADC field data 
As well as publishing the MIL-217 series, the RADC publishes failure 
rate data obtained from field experience. Klein [45] presents much field 
data on microcircuits which is shown to agree approximately with MIL-217C. 
3.3.4 European Space Agency 
The ESA has its own data bank and requires companies wishing to 
tender for projects to perform a reliability prediction using the failure 
rate data generated and supplied by the ESA. This has the great 
advantage that all companies are forced to use the same failure rate data 
and a valid comparison between proposed designs can be made. 
3.3.5 Manufacturers' testing 
Most microcircuit manufacturers publish the results of accelerated 
testing on their devices, references[27,40]. Devices are subjected to 
thermal and physical shock tests as well as dynamic testing at elevated 
temperatures. A large number of devices are tested for typically several 
thousand hours, and the number of failures observed is fitted to a 
Chi-squared distribution. The failure rate is typically quoted at the 60% 
confidence level and the failure rates are further modified according to 
the Arrhenius acceleration factor, to give a failure rate appropriate to 
the likely conditions of usage. There is some discrepancy between 
different manufacturers' predictions, since they make use of different 
activation energies as applied to the Arrhenius acceleration equation. 
Tests are also performed to verify the suitability of device packages as 
regards shock and humidity as reported by Motorola [40]. 
3.3.6 Simple models 
A simple model for microcircuits is quoted by the RRE which is useful 
when very few parameters are known : 
$$\lambda = \lambda_a K_d K_l \qquad (3.3.6)$$
where : $\lambda_a$ = 5 digital bipolar 
$\lambda_a$ = 6 digital MOS 
$\lambda_a$ = 12 linear bipolar 
$K_d$ = (die area in square inches) / 0.015 
$K_l$ = 1 + (N - 12) / 24 
N = number of package leads 
For this model to apply, the ambient temperature must not exceed 55°C, the 
junction temperature must not be greater than 40°C above ambient, and 
plastic encapsulation must not be used. If plastic encapsulation is used, 
then it is recommended that the failure rate is doubled. 
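The simple model lends itself to a short calculation, sketched below. The form assumed is base rate x (die area / 0.015) x (1 + (N - 12)/24), doubled for plastic encapsulation; the die area of 0.03 square inches and the 40-lead package in the example are hypothetical, and the result is in whatever units the RRE base rates are quoted in.

```python
# Base failure rates for the three device classes listed in the text.
BASE_RATE = {
    "digital bipolar": 5.0,
    "digital MOS": 6.0,
    "linear bipolar": 12.0,
}


def rre_failure_rate(device_type, die_area_sq_in, leads, plastic=False):
    """Simple RRE-style estimate (assumed form of the model).

    device_type    : one of the keys of BASE_RATE
    die_area_sq_in : die area in square inches
    leads          : number of package leads, N
    plastic        : if True, double the rate as recommended for plastic parts
    """
    kd = die_area_sq_in / 0.015          # die area factor
    kl = 1.0 + (leads - 12) / 24.0       # package lead factor
    rate = BASE_RATE[device_type] * kd * kl
    return 2.0 * rate if plastic else rate


# Hypothetical 40-lead digital MOS device with a 0.03 sq. in. die.
print(rre_failure_rate("digital MOS", 0.03, 40))
print(rre_failure_rate("digital MOS", 0.03, 40, plastic=True))
```

Only three parameters are needed, which is the attraction of the model when little is known about a device, although the temperature restrictions stated above still apply.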
3.3.7 GIDEP computer data base 
The Government Industry Data Exchange Programme is an American based 
association which was established in 1959 and provides access to four data 
banks, which are : 
(i) Engineering data bank 
(ii) Reliability-maintainability data bank 
(iii) Failure experience data bank 
(iv) Metrology data bank 
The reliability-maintainability data bank is of particular interest as 
regards reliability prediction. 
3.3.8 EuReDatA 
The European Reliability Data Bank Association was established in 1974 
on a voluntary basis and was formally constituted in 1979. The 
organisation is non profit making and is supported by the EEC. Its main 
aims are to promote data exchange between organisations and to set up 
standard methods for obtaining and using reliability data. 
3.3.9 MIL-217 Data base 
The failure rate models and factors contained in the MIL-217 series 
ideally lend themselves to computerisation, since mathematical formulae are 
given for all the failure rate factors, $\pi_T$, $\pi_Q$ etc. The Predictor package 
[81] includes the computerisation of MIL-217. The CNET data, likewise, is 
suitable; however, data such as NCSR [34] is not suitable since no formulae 
are given for the failure rate factors, but only tables of figures. 
3.4 COMPARISON OF FAILURE RATE DATA 
The failure rates for typical components used in a digital controller 
are given in tables(2) to (16). There are two main sources of data : 
(i) Field and accelerated life testing data, 
(ii) Predicted failure rates based on models such as MIL 217. 
3.4.1 Resistors 
The failure rates of oxide film resistors are compared in table(2). 
Of the predicted data, the CNET prediction is identical to MIL 217 which 
might be expected since the CNET data is based on MIL 217B. The NCSR 
prediction is double that of the other predictions, but fits in between the 
two failure rates based on field experience. In addition the NCSR data is 
based on field experience within the nuclear industry, so it is proposed 
that the NCSR failure rate is the value most likely to be correct. There 
is however good agreement between all values with the worst and best 
failure rates only differing by a factor of four. 
3.4.2 Capacitors 
The failure rates of a 0.1µF decoupling capacitor are given in table 
(3). This time agreement is not good with the worst and best values 
differing by a factor of twenty. This difference may be partly due to the 
wide variation in capacitor types and it is not possible to make 
predictions for identical capacitor types. However the CNET and MIL 217 
values are in close agreement together with the ICL field data, as was the 
case for resistors, so it is proposed that the MIL 217 failure rate is 
accepted for this type of capacitor. 
3.4.3 Soldered joints 
The failure rates of soldered joints are compared in table(4). 
Ignoring the CNET prediction which seems to be low, there is good agreement 
between the other values and again the ICL field data agrees very closely 
with MIL 217. It is proposed that the MIL 217 value is accepted. 
3.4.4 Wire-wrap connections 
The failure rates of wire-wrapped joints are compared in table(5). 
The MIL 217 and CNET predictions seem grossly optimistic when compared with 
the field data and soldered joints. In the case of the MIL 217 data it is 
suspected that very few wire-wrap joints are used in military equipment, 
but many more are used in the computer industry. For this reason and 
because the ICL data and Dummer agree so closely, it is proposed that the 
field data is more likely to be correct. 
3.4.5 Edge connectors 
The failure rate of edge connectors is compared in table(6). It is 
difficult to compare failure rates because of the wide variety of edge 
connectors and the mating/unmating cycles are not known for the field data. 
The Dummer failure rate seems high whilst the MIL 217 rate seems low when 
compared with field data as well as soldered and wire-wrapped joints. 
Although the CNET data is based on MIL 217, the CNET failure rate for 
connectors is much higher (a factor of 30). Presumably CNET found the MIL 
217 model to be too optimistic. The CNET prediction agrees with the ICL 
data and seems to be a sensible value, so it is proposed that the CNET data 
is used in preference to the other sources. 
3.4.6 Integrated circuits 
The failure rates of integrated circuits are several orders of 
magnitude greater than those of resistors, capacitors, or connections, and 
therefore, because of the large number of integrated circuits used in a 
digital controller, their failure rate dominates the total controller 
failure rate. The failure rate of integrated circuits has been shown to fit 
the Arrhenius relationship as discussed in section 3.2, which means that 
the failure rate is exponentially dependent on junction temperature and 
activation energy. It is therefore essential to make comparisons between 
different reliability data sources at equivalent activation energies and 
junction temperatures, otherwise the exponential dependence will introduce 
large differences between the failure rates. The activation energies used 
by MIL 217D, MIL 217C and NCSR appropriate to different device technologies 
are given in table(7). The higher the activation energy, the higher the 
failure rate at an elevated temperature. 
The main difference between MIL 217C and MIL 217D is the section on 
microcircuits. Only two activation energies are used in the calculation of 
$\pi_T$ in MIL 217C and no distinction is made between hermetic and 
non-hermetic (plastic) encapsulation. The revisions incorporated in MIL 217D 
seem to improve the predictions. Nine different activation energies are 
used according to device technology and a distinction is made between 
hermetic and non-hermetic encapsulation with a higher activation energy 
quoted for non-hermetic devices. This seems reasonable since non-hermetic 
devices are more prone to corrosion due to moisture, and this failure 
mechanism is shown to have a high activation energy as shown in table(1). 
The NCSR data only uses three activation energies which agree closely with 
MIL 217D for hermetic devices. The NCSR activation energies make no 
distinction between hermetic and non-hermetic devices and it is simply 
recommended to multiply the failure rate by two for non-hermetic devices. 
This approach is felt to be too simple. 
The CNET failure rate model uses two activation energies of 0.3 and 
1.0eV in the calculation of $\pi_t$, the temperature acceleration factor. The 
weighting between 0.3 and 1.0eV is varied according to device technology. 
This seems to be a better approach than MIL 217D and NCSR which take the 
rather simplistic view that only one activation energy is present when many 
activation energies may all be making contributions through different 
failure mechanisms. 
Some manufacturers' accelerated testing makes use of two activation 
energies, references[23,29], although one activation energy is seen to 
dominate. 
The junction temperature is calculated according to : 
$$T_j = T_{amb} + \theta_{ja} P_{diss} \qquad (3.4.1)$$
where: $T_j$ = junction temperature 
$T_{amb}$ = ambient case temperature 
$\theta_{ja}$ = junction/ambient thermal resistance 
$P_{diss}$ = power dissipated 
If $P_{diss}$ is small as in small CMOS or LSTTL devices, the temperature rise 
due to $\theta_{ja} P_{diss}$ will be small and the failure rate will be relatively 
insensitive to these two parameters. However in the case of a 
microprocessor which dissipates about 0.5W, and can physically be felt to 
run warm, it is important to determine accurately $\theta_{ja} P_{diss}$. Motorola[42] 
give values of $\theta_{ja}$ from 70-115°C/W for a plastic device and 50°C/W for a 
ceramic device. Considering MIL 217D and the 6800 microprocessor, $\theta_{ja}$ is 
correctly given as 50°C/W for a hermetic device, but the worst case power 
dissipation of 1W is given. It would seem more reasonable to use the 
typical power dissipation of 0.5W which reduces the junction temperature by 
25°C, giving a large increase in predicted reliability. 
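The effect of the power dissipation figure on the junction temperature can be checked with the short sketch below, which uses the thermal resistance and power figures quoted above for the 6800. The ambient temperature of 25°C and the function name are assumptions made for the example.

```python
def junction_temperature(t_amb_c, theta_ja_c_per_w, p_diss_w):
    """Junction temperature in deg C: ambient plus thermal resistance x power."""
    return t_amb_c + theta_ja_c_per_w * p_diss_w


T_AMB = 25.0      # deg C, assumed ambient for the example
THETA_JA = 50.0   # deg C/W, hermetic 6800 as quoted in the text

worst_case = junction_temperature(T_AMB, THETA_JA, 1.0)   # worst case 1 W
typical = junction_temperature(T_AMB, THETA_JA, 0.5)      # typical 0.5 W

print("worst case Tj:", worst_case, "deg C")
print("typical Tj:   ", typical, "deg C")
print("difference:   ", worst_case - typical, "deg C")
```

Because the failure rate depends exponentially on junction temperature, a difference of this size feeds through to a much larger difference in the predicted failure rate.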
If $\theta_{ja}$ cannot be determined from manufacturers' data or the table in 
MIL 217D, then MIL 217D recommends that the following values are used : 
Package type                 $\theta_{ja}$ (°C/W) 
≤ 22 pin hermetic            30 
≤ 22 pin non-hermetic        125 
> 22 pin hermetic            25 
> 22 pin non-hermetic        100 
The NCSR data recommends the following values for $\theta_{ja}$ : 
$\theta_{ja}$ = 60°C/W       hermetic devices 
$\theta_{ja}$ = 155°C/W      non-hermetic devices 
Compared with the recommendations of MIL 217D and Motorola, the NCSR 
figures seem excessively high. 
Although the power dissipation, $P_{diss}$, is given in MIL 217D in the 
same table as $\theta_{ja}$, it is recommended that the manufacturers' data is 
consulted instead since (for example) the power dissipation of an 8080 
microprocessor is incorrectly given as 1.7W and not 1.5W as given by 
Intel [33]. It is further recommended that a more realistic failure rate 
is obtained by using the typical power dissipation, as quoted in the 
manufacturers' data, and not the maximum power dissipation. If possible the 
device case temperature should be measured to give an accurate value of 
the ambient case temperature, $T_{amb}$. 
3.4.7 TTL integrated circuits 
The failure rates of TTL integrated circuits are compared in table 
(8). The junction temperature of the ICL devices is unknown, but all other 
predictions are based on a junction temperature of 33°C. The four 
predicted failure rates use very similar activation energies, Ea. The MIL 
217C prediction is obviously too high by at least an order of magnitude, 
but the other failure rates are seen to vary from worst to best by a factor 
of six. The predicted failure rates are pessimistic when compared with the 
ICL field data, although this difference may be due to lower junction 
temperatures of the ICL devices. If the CNET, MIL 217D and NCSR 
predictions are compared, they are seen to vary by a factor of three which 
is thought to be very good. Since the CNET prediction falls between the 
other two predictions, it is proposed that the CNET data is accepted in 
preference to the other two for TTL failure rate prediction. 
3.4.8 6800 Microprocessor 
The failure rates for a 6800 microprocessor are compared in table(9). 
The activation energies and junction temperatures used to calculate the 
failure rates are those recommended by the respective reliability data 
sources and are seen to vary considerably. All the predictions use a 
junction temperature which is at least 15°C too high. Again MIL 217C is 
seen to be too high by at least two orders of magnitude, although this is 
partly due to using a higher activation energy than the other predictions. 
Table(10) shows the failure rates of table(9) converted to a common 
base of junction temperature and activation energy. An activation energy 
of 1.0eV is chosen, since this is the value used by Motorola as well as the 
MIL-STD 883 screening procedure. MIL 217C is ignored since this is much 
higher than the other values, however both MIL 217D and NCSR predictions 
are seen to agree closely and give a sensible value of about one failure 
per twenty years. The Motorola failure rate seems to be too low and 
corresponds to one failure per thousand years. 
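The intervals quoted above can be related to failure rates per 10^6 hours with the conversion sketched below. It assumes continuous operation, that is 8760 hours per year, and a constant failure rate; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760.0  # 365 days of continuous operation (assumption)


def years_between_failures_to_rate(years):
    """Failures per 10^6 hours for a device failing on average once per `years`."""
    return 1.0e6 / (years * HOURS_PER_YEAR)


def rate_to_years_between_failures(rate_per_million_hours):
    """Average interval between failures, in years, for a given rate."""
    return 1.0e6 / (rate_per_million_hours * HOURS_PER_YEAR)


print("one failure per 20 years   =",
      years_between_failures_to_rate(20.0), "failures per 10^6 hours")
print("one failure per 1000 years =",
      years_between_failures_to_rate(1000.0), "failures per 10^6 hours")
```

On this basis one failure per twenty years is a little under six failures per 10^6 hours, and one failure per thousand years is roughly a tenth of a failure per 10^6 hours.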
3.4.9 8080 Microprocessor 
The failure rates of an 8080 microprocessor are compared in table(11). 
The activation energies are very similar except for MIL 217C. It is 
interesting to note that Intel chose an activation energy of 0.5eV as 
compared with 1.0eV used by Motorola. The junction temperatures vary 
considerably, although the junction temperature used by Intel is unknown, 
but probably around 89°C. The simple RRE failure rate model given in 
section 3.3 is seen to give a reasonable result, although probably too high 
by an order of magnitude. Again MIL 217C is seen to be too high by about 
two orders of magnitude. 
Table(12) shows the failure rates of table(11) converted to a common 
base of junction temperature and activation energy. With the exception of 
MIL 217C, the predictions agree closely, although they are higher than the 
predictions for the 6800 by a factor of four. This is because of the 
higher junction temperature of the 8080 microprocessor. The Intel and 
Motorola failure rates determined by accelerated testing are seen to agree 
closely, although they are felt to be too low. 
3.4.10 EPROM 
The failure rates of 2716 EPROMs are compared in table(13) at a common 
junction temperature. With the exception of MIL 217C, the failure rates 
agree within an order of magnitude which is much better than for 
microprocessors. The CNET failure rate is seen to agree most closely with 
the manufacturer's failure rate. An activation energy of 0.55eV is used by 
both MIL 217D and NCSR since this is the energy corresponding to NMOS 
devices. Intel have found that 0.3eV is more suited to accelerated testing 
of EPROMs. If the MIL 217D and NCSR failure rates were calculated using an 
activation energy of 0.3eV, then agreement between the failure rates would 
be better. 
3.4.11 Bipolar ROMs 
The failure rates for a 1k bipolar ROM are compared in table(14). 
With the exception of MIL 217C, the failure rates all agree very closely 
and it is impossible to suggest that one value is more believable than 
another. It is interesting to note that all reliability data sources use 
nearly identical activation energies. 
3.4.12 Dynamic RAM 
The failure rates for a 16k dynamic RAM are compared in table(15). 
The MIL 217C value seems too high, whilst the Motorola figure seems too 
low. If the Motorola failure rate, having an activation energy of 1.0eV, 
is converted to a failure rate having an activation energy of 0.3eV, then 
it may be compared sensibly with the Intel failure rate. Both the MIL 217D 
and NCSR failure rates are high when compared with the manufacturers' 
accelerated test results and the CNET prediction seems to agree more 
closely, as found by Reynolds [35]. 
3.4.13 Static RAMs 
The failure rates of 1k static RAMs are compared in table(16). With 
the exception of MIL 217C, the failure rates agree closely and it is 
impossible to suggest that one data source is preferable to another. 
3.4.14 Comparison with other field data 
Klein [45] compares field failure rates with MIL 217C predictions. 
For all devices except PROMs, MIL 217C is found to be pessimistic by up to 
two orders of magnitude. For PROMs, MIL 217C is shown to be too high by an 
order of magnitude for some devices, whilst it is found to be too low by an 
order of magnitude for other devices. For microprocessors MIL 217C is 
pessimistic by up to two orders of magnitude and observed failure rates are 
in the range 0.3-2.0 f/million hours. For RAMs, MIL 217C is correct for 
some devices, but is pessimistic by up to two orders of magnitude for the 
majority of devices, and observed failure rates are in the range 0.1-20 
f/million hours. For ROMs, MIL 217C is pessimistic for most devices by an 
order of magnitude and observed failure rates are in the range 0.08-0.8 
f/million hours. 
Reynolds [35] concludes that MIL 217C is pessimistic for most 
microcircuits and that the CNET predictions are more appropriate to "ground 
fixed" applications. 
Daniels et al [46] compare observed failure rates based on field data 
with predictions made according to MIL 217C, incorporating Notice 1 (May 
1980), for twelve pieces of equipment of two types, one being analogue and 
the other a digital controller. The comparison shows that 79% of predicted 
values are within a factor of two of the observed values, although MIL 217C 
is always pessimistic. In fact where microcircuits are concerned, it is 
proposed that they should have found there to be wider disagreement between 
predicted and observed values. It was assumed as a basis for the 
comparison that the temperature rise within equipment is only 10°C above an 
ambient temperature of 25°C. Thus the failure rate of a 6800 
microprocessor is predicted at a junction temperature of 35°C. This 
junction temperature is much too low as previous discussion has shown. 
Assuming a thermal resistance of $\theta_{ja}$ = 50°C/W and a typical power 
dissipation of 0.5W, the temperature rise within a 6800 microprocessor will 
be at least 25°C, which gives a junction temperature of 50°C. Since the 
types of microcircuits used in the digital controller are not specified it 
is impossible to conclude how pessimistic MIL 217C is, especially for 
microcircuits. 
Claridge [47] makes a comparison between observed and predicted 
failure rates of a large instrumentation and protection system. Although 
very few integrated circuits are used in the system, a useful lesson can 
probably be learnt. The system comprises about nine hundred sub-units, 
each containing about forty components. If the observed failure rates of 
the sub-units are compared with the predictions, then the failure rates are 
seen to vary by about two orders of magnitude. If however the failure rates 
of the systems consisting of nine hundred sub-units are compared, then the 
predicted and observed failure rates vary by a factor of four with the 
prediction almost always pessimistic. Unfortunately the data source used 
for failure rate prediction is not quoted, but this example shows that 
observed and predicted failure rates can be shown to agree more closely if 
a comparison is made between systems containing many thousand components, 
rather than a hundred or so. A statistical analysis is always seen to be 
more accurate if the sample size is increased. 
3.4.15 Recommendations 
The most widely used data source for reliability prediction is 
probably the MIL 217 series. It has the advantage that most commonly used 
components are listed. It has been shown that many of the predictions of 
MIL 217D, especially those for microcircuits, are pessimistic, although it 
is better to err on the side of caution. If a more accurate failure rate 
prediction is required, then it is suggested that the following failure 
rates and reliability data sources are used for the following components : 
Component type               Data source / failure rate 
Resistor                     NCSR 
Decoupling capacitor         MIL 217D 
Soldered joint               MIL 217D 
Wire-wrap connection         0.0008 f/M hrs 
Connectors                   CNET 
TTL integrated circuits      CNET 
Microprocessors              CNET 
EPROM                        CNET 
Bipolar PROM                 CNET / MIL 217D 
Dynamic RAM                  CNET 
Static RAM                   CNET / MIL 217D 
The findings of this section as well as Reynolds [35] indicate a 
preference for the use of CNET data for microcircuits, rather than MIL 
217D, although MIL 217D should not be in error by more than one order of 
magnitude. 
Whichever reliability data source is used, it is recommended that for 
integrated circuits, the thermal resistance $\theta_{ja}$ is determined from the 
manufacturers' data or, failing that, from MIL 217D as described previously 
in this section. The typical power dissipation, determined from the 
manufacturers' data, should be used to calculate the device junction 
temperature, $T_j$. 
3.5 COMPARISON OF DIFFERENT DEVICE TECHNOLOGIES AND ENCAPSULATION 
For a given type of encapsulation, the failure rate depends on the 
device technology in two ways : 
(i) Activation energy - The activation energies corresponding to the 
failure mechanisms of different technologies are given in 
table(7) and discussed in section 3.4. There is fairly good 
agreement between manufacturers and data sources for failure rate 
prediction. 
(ii) Power dissipation - The device junction temperature depends on 
the ambient temperature and the temperature rise within the 
microcircuit package, which is dependent on the power 
dissipation. Technologies such as TTL and ECL exhibit much 
higher power dissipation than CMOS. 
3.5.1 CMOS vs TTL logic 
It is shown by Peterson [43] that non-hermetic CMOS devices are more
sensitive to moisture activated corrosion and surface instability than TTL
devices. Many CMOS devices were found to fail because properties such as
input current and leakage current increased, sending the device out of
tolerance. This is probably to be expected because of the high input
impedance of CMOS. If hermetic CMOS is considered, then it is still
reported by Johnson et al [48] that TTL devices generally exhibit a much
higher reliability.
The comparison between CMOS and TTL is to some extent a trade-off
between temperature and activation energy. Bipolar TTL devices dissipate
more power than CMOS and therefore have a higher junction temperature, but
their activation energy is lower. CMOS devices fail according to a high
activation energy, but dissipate negligible power. Peterson [43] concludes
that plastic encapsulated CMOS has a failure rate five to thirty times that
of plastic TTL devices. Although there is considerable evidence suggesting
that CMOS devices are less reliable than TTL, the findings of Motorola [39]
suggest that CMOS, NMOS and HMOS have approximately equal failure rates,
based on nearly six million hours of testing at 125°C on seven thousand
devices.
Unless design considerations dictate otherwise, it is recommended that
TTL logic is used in preference to CMOS for the above reasons.
3.5.2 Hermetic vs non-hermetic encapsulation
There are basically three types of hermetic package:
(i) Ceramic package with metal lid - This type of encapsulation is
common for VLSI devices and consists of a piece of ceramic
moulded around the lead frame with a central "well" in the
package. The die is placed in the well and wire-bonded to the
lead frame. The device is then sealed by brazing or soldering on
a metal lid. The package is available in both DIL and leadless
chip carrier.
(ii) Cerdip - This package is available in both DIL and flat pack and
consists of a lower slab of ceramic which has the lead frame
moulded into it. The die is placed on top of the lead frame and
wire-bonded to it. The package is sealed by a ceramic lid with a
glass seal around the edges.
(iii) Metal can - This type of package is rarely used for logic devices
and is mainly used for linear and hybrid devices. The
microcircuits are contained within a sealed metal can with
connections passing through glass seals on the bottom of the
package.
Non-hermetic packages consist of a die attached and wire-bonded to a
lead frame and then the die and lead frame are encapsulated using an epoxy,
silicone, or phenolic resin.
Apart from other considerations, hermetic devices have a lower thermal
resistance (typically 50°C/W) than plastic devices (typically 100°C/W).
This difference between plastic and ceramic devices may be reduced with the
introduction of copper lead frames instead of aluminium as currently used.
Motorola [39] report the advantage of using a copper lead frame and the
improvement in plastic device thermal resistance. For devices which
dissipate a lot of power, the junction temperature of hermetic devices will
be lower than that of plastic devices, which should give a more reliable
device. If devices are to be used over the full military temperature range
of -55°C to +125°C, then it is necessary to use ceramic devices.
References [49,50] suggest that ceramic devices are twice as expensive
as plastic devices and Hakim et al [50] propose that ninety per cent of all
integrated circuits are plastic. This poses a problem for high reliability
users of microcircuits since nearly all devices are available in plastic,
but not all are available in ceramic. It is therefore necessary for the
high reliability user to consider the use of plastic devices on the grounds
of cost and availability. Plastic microcircuits offer lower weight and it
has been suggested that the encapsulant which surrounds the die and
wire-bonds makes the device more robust.
There are two main disadvantages with plastic devices:
(i) Moisture - Moisture which is either trapped inside the package when
sealed, or which leaks into the package along the lead-frame, has been
shown to cause corrosion of the die and wire-bonds,
references [43,48,50,54,66].
(ii) Package integrity - If the coefficients of thermal expansion of
the die, lead-frame, and moulding compound are not equal,
stresses will be set up within the package which may crack the
die or break the wire-bonds.
In order to accelerate the failure of devices due to moisture
corrosion it is common to perform THB (temperature humidity bias) testing.
The device under test is placed under conditions of 85°C/85% rh with a
static bias of +5V applied. The biassing of the device is organised to
minimise the power dissipation of the device so that heat generated on-chip
will upset the test conditions as little as possible. References [51,52,53]
propose equations linking the acceleration in failure rate to the
temperature and relative humidity. Reynolds [35] suggests that equipment
under benign conditions of 30°C/30% rh will undergo an acceleration in
failure rate of 550 under conditions of 85°C/85% rh, whilst the
corresponding acceleration factor for equipment normally operated under
uncontrolled conditions of 12°C/80% rh is 210. The latter acceleration
rate is chosen as the rate most appropriate to British Gas "field
conditions".
Lycoudes [54] concludes that new plastics have virtually eliminated
the early problems of package integrity and that the reliability of plastic
devices is comparable to that of hermetic devices, provided that
environmental conditions are not extreme.
Results of accelerated THB testing are given by Hakim et al [50] who
report that hermetic ceramic devices perform best of all, but that the
extrapolated failure rates for plastic devices are acceptable. They
conclude that the use of plastic devices in military and high reliability
equipment is not recommended under all conditions, but their use under
controlled conditions might be acceptable.
Bauer et al [66] recommend that plastic devices may be used with
caution in high reliability applications as long as their failure rate is
estimated to be four times worse than equivalent ceramic devices.
The results of 85°C/85% rh testing, thermal shock testing, and
temperature cycle testing on plastic packages are reported by Motorola
[39,40,41]. The results of thermal shock testing are good, especially in
reference [40], confirming that new plastics have eliminated early problems.
The results of 85°C/85% rh testing, reference [39], are more difficult to
interpret. For a random failure process it is a valid assumption to
multiply the number of devices under test by the test duration, which gives
an equivalent number of device hours under test conditions. However
failure due to corrosion is not a random process and should not be analysed
by this method. The only conclusion that can be drawn from the Motorola
data is that out of a sample of 1717 devices, 18 devices had failed after
1008 hours of testing at 85°C/85% rh. If however this recommendation is
ignored and the results of THB testing in reference [39] are used to give a
failure rate at 85°C/85% rh, then at 90% confidence the failure rate is 14
f/M hours which agrees well with the results of Hakim et al [50]. If this
failure rate is extrapolated to conditions of 70°C/30% rh such as might
prevail under Motorola's accelerated temperature testing, then the result
is a failure rate of 0.37 f/M hours as compared with the accelerated
testing result for plastic devices of 0.2 f/M hours. Although THB testing
results should not be extrapolated to give a failure rate, it appears that
reasonable results are obtained if extrapolation is performed.
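The figure of 14 f/M hours at 90% confidence can be approximately reproduced by the sketch below. The method is an assumption, since the text does not state how the bound was obtained: a one-sided upper confidence limit for a constant failure rate in a time-terminated test, found here by bisection on the Poisson distribution. All function names are hypothetical, and every device is assumed to accumulate the full 1008 hours. As noted above, corrosion is not a random process, so the calculation is illustrative only.

```python
import math


def poisson_cdf(k, mean):
    """Return P(X <= k) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


def upper_failure_rate(failures, device_hours, confidence):
    """One-sided upper bound on a constant failure rate (per hour).

    Finds the expected number of failures m for which observing
    `failures` or fewer has probability (1 - confidence).
    """
    target = 1.0 - confidence
    lo, hi = 0.0, 10.0 * (failures + 10)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(failures, mid) > target:
            lo = mid  # mean too small, bound lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi) / device_hours


# Motorola THB data quoted in the text: 18 failures, 1717 devices, 1008 hours.
device_hours = 1717 * 1008
rate_per_million_hours = upper_failure_rate(18, device_hours, 0.90) * 1e6
print(round(rate_per_million_hours, 1))
```

Under these assumptions the bound comes out close to the quoted 14 f/M hours.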
For the above reasons, the results of extrapolated THB testing should
be treated with caution.
An interesting result is given by Motorola [39], in which the failure
rates of plastic and ceramic devices are given as 0.2 f/M hours and 0.24
f/M hours respectively. This suggests that under favourable conditions,
there is no difference between the reliability of plastic and ceramic
devices. This finding is also the conclusion of Fox [67].
The graph of figure(8) shows the failure rates of equivalent plastic
and ceramic 8085 microprocessors plotted against ambient temperature. The
failure rates are calculated according to MIL 217D and CNET data, using the
following equations:

MIL 217D

$$\lambda_{plastic} = 0.1295 \exp\left[ 9270 \left( \frac{1}{298} - \frac{1}{T + 358} \right) \right] + 1.785 \tag{3.5.1}$$

$$\lambda_{ceramic} = 0.0296 \exp\left[ 6373 \left( \frac{1}{298} - \frac{1}{T + 315} \right) \right] + 0.533 \tag{3.5.2}$$

CNET

$$\lambda_{plastic} = \ldots + 1.575 \tag{3.5.3}$$

$$\lambda_{ceramic} = \ldots + 0.525 \tag{3.5.4}$$
The MIL 217D and CNET predictions for ceramic devices agree closely,
but predictions for plastic devices disagree by nearly an order of
magnitude. The CNET data, which is less pessimistic than MIL 217D, is felt
to be more accurate when considered in relation to this section and the
findings of Motorola. In fact under favourable conditions, packages may be
even better than CNET data predicts.
It is proposed that the lower failure rate of plastic TTL devices as
compared with CMOS is due in part to the heat generated by TTL devices,
which tends to drive off any moisture present in the package. Since CMOS
generates almost negligible heat, any moisture present in the package will
not be driven off and moisture activated corrosion will proceed at a faster
rate because of the higher level of relative humidity.
3.5.3 Recommendations
1. TTL logic should be used in preference to CMOS unless power
consumption is critical.
2. There is strong evidence to suggest that hermetic devices are slightly
more reliable than plastic devices, and it is recommended that ceramic
packages are used in all but the most cost-sensitive applications.
3. Failure rate prediction of plastic encapsulated devices should be
performed using CNET data in preference to MIL 217D.
3.6 SCREENING AND METHODS OF IMPROVING FAILURE RATES
One of the most common ways of improving the reliability of individual
components and electronic systems is the technique of "burn-in". The
device under test is operated under normal conditions or, more commonly, at
elevated temperature for a period of time which is sufficient to remove the
freak population of early failures which occur on the "infant mortality"
section of the "bath tub" curve. As components fail, they are replaced and
on completion of a successful burn-in, all weak components will have been
replaced and the failure of components will assume a constant rate due to
random failures. Techniques for determining an optimum burn-in profile are
given in references [55,56], since it is important to burn-in for sufficient
time to expose weak components, but further burn-in time will be a waste of
time or, in the case of some components, will exhaust some of the useful
life of the components.
For high reliability applications, components are available from
manufacturers which have satisfied agreed standard screening procedures.
Such screening may involve visual, mechanical, and electrical tests.
Integrated circuit failure rate predictions in MIL 217D refer to the screen
level of a device according to US MIL STD 883. This screening procedure
was introduced in 1968 in order to create an economically feasible
standardised integrated circuit screening flow which would achieve
in-equipment failure rates of 0.8 f/M hours and 0.04 f/M hours for class B
and class A devices respectively. The standard has been modified since 1968
and now represents a very tough screening specification. An equivalent
screening specification is covered by British Standard BS9400. The
screening specifications for integrated circuits are covered in detail by
National Semiconductors [58].
The relation between MIL 217D screen level, screening specification,
quality factor $\pi_Q$, and typical relative cost is given in table(17).
Considerable differences exist between MIL STD 883 class S and BS9400 class
A; for example, class S screening specifies non destructive wire-bond pull
tests. However, the two classes are broadly similar. The screening classes
B and C of MIL STD 883 are very similar to those of BS9400, and BS9400
includes a fourth class D not covered by MIL STD 883. The screening
requirements of BS9400 are shown in figure(9). Considering screening level
A, the process begins with a visual examination of the die and wire-bonds
under 30-200 times magnification. The condition A visual examination is
more stringent than B and specifies maximum allowable deformation in
metallisation widths etc. The visual examination is obviously very labour
intensive and therefore expensive. The device is then encapsulated and
baked to stabilise it. The integrity of the package and wire-bonds is now
tested by temperature cycling, mechanical shock, and constant acceleration
tests. The package seal is tested next by fine and gross leak tests and
then the device is tested electrically. In order to detect weak devices,
the device is burnt-in at high temperature, after which it is tested again
electrically. A high temperature reverse bias is now applied and the
device is again tested electrically. Finally the device undergoes an X-ray
examination to detect bad wire-bonds, particles inside the package, or bad
die adhesion. If all these tests are passed successfully, the device
continues to routine testing before being released. Screening levels B, C,
and D form a sub-set of level A as well as having a less stringent visual
examination and burn-in. The only difference between screening level B and
C is the final burn-in of class B devices. It is therefore recommended
that devices are procured according to BS9400 class C at a typical relative
cost of six times that of a plastic device. The equipment is then
assembled and subjected to a burn-in at 125°C for 160 hours. In this way,
class C devices which survive the assembly and burn-in could be considered
to be equivalent to class B devices. According to table(17), this burn-in
process should increase the reliability by a factor of three, making the
equipment an order of magnitude more reliable than equipment using plastic
devices, but at only six times the cost.
As an alternative to this burn-in procedure, British Telecom burn-in
some high reliability equipment for 500 hours at 85°C.
3.6.1 Derating and cooling
Reducing the stresses on a component will increase its life.
Jensen [57] suggests that components should be derated according to the
following guidelines for high reliability applications:
Resistors - derate power to 0.5 rated
Electrolytic capacitors - maintain core temperature below 70°C
Semiconductors - Power derating = 0.3
                 Current derating = 0.5
                 Voltage derating = 0.6
Wherever possible, the junction temperature of semiconductor devices should
not exceed the maximum values of table(18).
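The semiconductor guidelines above lend themselves to a simple design check, sketched below. It assumes that each derating figure is the fraction of the rated maximum at which the device may be operated, by analogy with the resistor guideline; the function names, the device ratings and the operating point are hypothetical.

```python
# Derating factors for semiconductors as quoted from Jensen [57] in the text.
SEMICONDUCTOR_DERATING = {"power": 0.3, "current": 0.5, "voltage": 0.6}


def derated_limits(rated):
    """Scale each rated maximum by its derating factor."""
    return {key: rated[key] * factor
            for key, factor in SEMICONDUCTOR_DERATING.items()}


def within_derating(rated, operating):
    """Return True if every operating stress is within its derated limit."""
    limits = derated_limits(rated)
    return all(operating[key] <= limits[key] for key in limits)


# Hypothetical device ratings (W, A, V) and a proposed operating point.
rated = {"power": 1.0, "current": 2.0, "voltage": 50.0}
operating = {"power": 0.25, "current": 0.8, "voltage": 24.0}
design_ok = within_derating(rated, operating)
print(design_ok)
```

A check of this kind could be run over a complete parts list at the design stage, before any failure rate prediction is attempted.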
The reliability of CMOS and linear integrated circuits can be improved
by derating the power supply voltage. This is not possible with TTL and
NMOS logic which must be supplied with +5V ±5%. A 5% reduction in supply
voltage would increase the reliability slightly, but at the expense of
reduced noise immunity and tolerance of power supply voltage fluctuations.
Since the failure rate of integrated circuits is approximately
exponentially dependent on temperature, it is important to minimise the
junction temperature. Therefore circuit boards should be mounted well away
from power supplies, heat sinks etc. and provided with as much cooling as
possible. In many cases, the use of forced cooling is advantageous and can
dramatically reduce the failure rate of equipment.
The power dissipation within a device is also dependent on the output
loading. Jensen [57] recommends that the fan-out of digital circuits and
the loading of linear circuits is derated by a factor of at least 0.8.
3.7 SOFTWARE AND TRANSIENT ERRORS 
Before discussing malfunctions in digital controllers, it is necessary
to define the terms "error" and "fault". They are defined by Anderson and
Lee [60] as:
Error - Part of an erroneous state that constitutes a difference from
a valid state.
Fault - An error in a component or the design of a system will be
referred to as a fault in a component or system.
An error can thus be seen as the manifestation of a fault, a single fault
producing one or more errors.
Faults in digital controllers are either permanent (hard) or transient
(soft). Permanent faults are normally easy to diagnose and repair and are
caused by the failure of components and software. Unless a software fault
is fundamental to the operation of the controller, it is usually possible
to recover from the fault by the technique of "exception handling",
described later. Careful testing should reveal permanent software faults;
however, hardware may fail at any time, as described in section 3.4, and
redundancy techniques are required to protect against such failures.
Although it is difficult to predict accurately the failure rates of
hardware, as demonstrated in section 3.4, sufficient data exists to make an
estimate which will be correct within an order of magnitude or so.
Transient faults are much harder to deal with since they are by their
very nature of short duration, and may be caused by many different sources.
It is not always possible to detect the cause of a transient fault,
although a particular class of transient fault may be injected into a
system to measure the system's fault-tolerance to that class of fault.
internal operation of most microprocessors is dynamic and relies
on stored charge, so it is reasonable to expect that smaller
device geometries in microprocessors will increase their
sensitivity to alpha radiation.
Motorola [61] describe the design procedure for a 64k
dynamic RAM with particular attention to the effect of alpha
radiation. Transient error rates as high as 500 f/M hours are
quoted for initial production units in 1979. Transient error
rates for current production devices are quoted as 22 f/M hours,
being 22 times greater than permanent failure rates.
(iii) Design faults - The physical construction of a piece of equipment
can give rise to transient faults due to screening, layout, or
decoupling problems. Timing and logic threshold faults are a
further class of fault and are sometimes affected by temperature,
making a piece of equipment malfunction over a certain
temperature range, whilst it works perfectly at other
temperatures. For this reason it is important to test
equipment over its complete operating temperature range.
(iv) Software - Transient software faults are usually the result of
programming errors which may be either incorrect algorithm
specification or coding errors in machine language or high level
language. Exceptionally, faults are caused by "bugs" in the
cross-assembler or compiler. Methods for improving software
reliability are given in section 5.4.
(v) Environmental effects - Often digital controllers are sited in
harsh electrical environments which may cause transient faults
due to noise on the electrical supply or high electromagnetic
field strengths. LSI circuits are particularly prone to
electrical interference and in the worst environments, further
measures may be required other than attention to earthing,
screening, and provision of a "clean" power supply.
A powerful tool for the correction of transient faults is the 
provision of fault-tolerant software which is written to check for faults 
and provide any necessary correction. 
3.8 SUMMARY
Often, as is the case for the experimental controller, a design
requirement is that a piece of equipment does not exceed a certain failure
rate. It is also necessary to predict the failure rate of equipment so
that alternative designs may be compared. The previous chapter has shown
that it is difficult to predict accurately the failure rate of equipment.
Components are assumed to fail according to the "bath-tub" curve and it is
assumed that during the useful life of the component, the failure rate
is constant. In order to screen semiconductor equipment and to produce
failure rate data, it is essential to accelerate the failure process and
the Arrhenius acceleration equation is universally used when calculating
the acceleration in failure rate due to a rise in temperature.
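For reference, the Arrhenius acceleration factor mentioned above may be sketched as follows. This is the standard textbook form, $AF = \exp[(E_a/k)(1/T_1 - 1/T_2)]$ with temperatures in kelvin; the function name, the activation energy of 0.7 eV and the two temperatures are assumptions chosen only for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(activation_energy_ev, t_use_c, t_stress_c):
    """Acceleration in failure rate when going from t_use_c to t_stress_c.

    Temperatures are junction temperatures in degrees Celsius.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


# Hypothetical case: 0.7 eV mechanism, 40 deg C in use, 125 deg C under test.
acceleration = arrhenius_acceleration(0.7, 40.0, 125.0)
print(round(acceleration))
```

The strong dependence on the assumed activation energy is one reason why different data sources can disagree so widely.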
The wide variation in failure rate prediction data is discussed in
section 3.4 and the comparison of different predictions with field data
suggests that some data sources are preferable to others. The recommended
failure rate prediction data source for each type of component is given at
the end of the section. For most components, the CNET data [38] is to be
preferred.
A comparison of device technologies shows that TTL is more reliable
than CMOS, which is probably due to the high input impedance of CMOS which
makes it very sensitive to moisture, and the higher power dissipation of
TTL which is sufficient to dry out the integrated circuit package. Unless
design considerations require the use of CMOS in low power applications, it
is recommended that TTL is used in preference to CMOS.
There is considerable disagreement as to whether plastic encapsulation
is worse than hermetic encapsulation. The findings of Motorola suggest
that there is no difference between the two methods of encapsulation, but
all other literature proposes that plastic devices should be used with
caution in high reliability applications. It is clear that improvements
have been made in the techniques of plastic encapsulation, and that plastic
devices are only significantly worse under conditions of high humidity. It
is recommended here that plastic devices are only used in the most cost
sensitive applications.
Components should not be operated at their maximum ratings and 
guidelines for derating components are given in section 3.6. Since the 
failure rate of components is exponentially dependent on temperature, the 
provision of forced cooling will dramatically reduce the failure rate. 
Although considerable importance is attached to the prediction and 
prevention of permanent failures, it is reported that transient errors are 
much more common [2,61,78] and it is important to protect against this
class of fault. The governor controller is tolerant of most classes of 
transient fault. Some of the protection is provided in hardware and is 
transparent to the software, whilst the rest of the protection is provided 
by software. Software fault-tolerance is a powerful tool for the detection 
and correction of transient errors. 
CHAPTER 4
PREDICTION OF RELIABILITY IMPROVEMENT DUE TO FAULT-TOLERANT
TECHNIQUES
4.1 METHODS OF EXPRESSING RELIABILITY
Unlike the physical parameters of an electronic device such as
resistance, capacitance, voltage etc. which are easy to use and to measure,
the reliability of a component is a very difficult quantity. The whole of
reliability engineering relies on the use of statistics and therefore great
caution must be exercised, since statements are made concerning statistical
quantities rather than physical quantities.
The failure of electronic components is shown by Cluley [1] to fit the
Poisson distribution. The reliability of a system is defined as:
The probability that the system will perform its required function,
under stated conditions, for a stated period of time.
Normally a system is required not to fail, so the reliability of a system
is given by the probability that no failures are observed during a given
time period. The probability is given by:

$$P(0) = \exp(-\lambda t) \tag{4.1.1}$$

where: $t$ = time interval
$\lambda$ = failure rate of the component
$P(0)$ = probability of no failures
which can be rewritten as:

$$R = \exp(-\lambda t) \tag{4.1.2}$$

where: $R$ = reliability of the component
In order to calculate the reliability it is necessary to know the
failure rate of the component, $\lambda$, which can be measured by means of life
testing or may be predicted as described in chapter three.
The concept of "mission time" is useful when it is required to predict
the probability that a piece of equipment will operate successfully over a
given time period. The probability of success is given by:

$$P = \exp(-\lambda T) \tag{4.1.3}$$

where: $T$ = mission time
Such an analysis is often used in space and defence when it is required to
calculate the probability of a successful space mission or the probability
that a missile will hit its target.
In an industrial environment, mission time is a less useful concept as
it is normally required to predict how long a piece of equipment will
operate before failing. The mean time to failure, MTTF, is defined by:

$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty \exp(-\lambda t)\,dt = \frac{1}{\lambda} \tag{4.1.4}$$

If the MTTF is substituted for the mission time for a piece of equipment,
then the reliability is given by:

$$R = \exp\left(-\lambda \cdot \frac{1}{\lambda}\right) = \exp(-1) = 0.37$$

This means that there is only a 37% chance that a piece of equipment will
operate for a time equal to the MTTF. It is for this reason that the MTTF
should be used carefully.
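The relations of this section can be checked numerically with the short sketch below. The failure rate of 10 f/M hours and the one-year mission time are hypothetical values chosen only for illustration, and the function names are not taken from the text.

```python
import math


def reliability(failure_rate, t):
    """R = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * t)


def mttf(failure_rate):
    """Mean time to failure for a constant failure rate."""
    return 1.0 / failure_rate


failure_rate = 10e-6          # hypothetical: 10 failures per million hours
one_year_hours = 8760.0

r_one_year = reliability(failure_rate, one_year_hours)
r_at_mttf = reliability(failure_rate, mttf(failure_rate))
print(round(r_one_year, 3), round(r_at_mttf, 2))
```

The second figure reproduces the 37% chance of surviving for a time equal to the MTTF, whatever failure rate is assumed.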
4.2 EFFECT OF REDUNDANCY ON RELIABILITY
For n units in series, having failure rates $\lambda_1$, $\lambda_2$, $\lambda_3$ etc., the
system reliability is given by:

$$R = \exp[-(\lambda_1 + \lambda_2 + \lambda_3 + \ldots)t] \tag{4.2.1}$$

The failure rates may simply be added together to calculate the total
system failure rate. This is the case for most failure rate predictions
and the failure rate of the governor controller is calculated by summing
the failure rates of the individual components.
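A parts-count calculation of this kind may be sketched as below. The component list and its failure rates are entirely hypothetical and are not the governor controller figures; the sketch only shows that summing the failure rates is equivalent to multiplying the individual reliabilities.

```python
import math


def series_failure_rate(rates):
    """Total failure rate of units in series (all must work)."""
    return sum(rates)


# Hypothetical parts list, failure rates in failures per million hours.
parts = {
    "microprocessor": 1.5,
    "EPROM": 0.8,
    "RAM": 0.6,
    "TTL logic": 2.0,
    "connectors": 0.1,
}

total_rate = series_failure_rate(parts.values())
hours = 8760.0

# Reliability from the summed rate, and as a product of part reliabilities.
r_from_sum = math.exp(-total_rate * 1e-6 * hours)
r_from_product = math.prod(
    math.exp(-rate * 1e-6 * hours) for rate in parts.values()
)
print(total_rate, round(r_from_sum, 4))
```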
The simplest form of redundancy is a parallel standby system where
either component can fail without the system failing. The reliability of
each component is:

$$R = e^{-\lambda t}$$

Assuming that the two components fail independently, the probability of
failure is given by:

$$P(f) = (1 - R)^2$$

where $P(f)$ = probability of failure
since: $P(s) = 1 - P(f)$ where $P(s)$ = probability of success
the reliability of the system is given by:

$$R_{sys} = 2R - R^2 \tag{4.2.2}$$

Cluley [1] shows that the MTTF of the system is equal to:

$$\mathrm{MTTF} = \frac{3}{2\lambda}$$

which is only 50 per cent more than the MTTF of a single channel.
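One way of arriving at this result, given here as a sketch rather than as Cluley's own derivation, is to integrate the reliability of the duplicated system over all time, writing the system reliability as $R_{sys}$ and each channel reliability as $e^{-\lambda t}$:

```latex
\begin{align*}
R_{sys}(t) &= 2e^{-\lambda t} - e^{-2\lambda t} \\
\mathrm{MTTF} &= \int_0^\infty R_{sys}(t)\,dt
  = \frac{2}{\lambda} - \frac{1}{2\lambda}
  = \frac{3}{2\lambda}
\end{align*}
```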
If a TMR system having perfect voters is considered, the reliability
is given by:

$$R_{sys} = 3R^2 - 2R^3 \tag{4.2.3}$$

and the MTTF is given by:

$$\mathrm{MTTF} = \frac{5}{6\lambda}$$

In this case the MTTF is less than that of a single channel. At first
sight this would suggest that a TMR system is inferior to a duplicated or a
single channel; however, a TMR system can be shown to be more reliable if
repair or the concept of mission time is considered.
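The TMR figure can be checked in the same way as for the duplicated system. The following is a sketch which assumes perfect voters and identical, independent channels, each with reliability $e^{-\lambda t}$:

```latex
\begin{align*}
R_{TMR}(t) &= 3e^{-2\lambda t} - 2e^{-3\lambda t} \\
\mathrm{MTTF} &= \int_0^\infty R_{TMR}(t)\,dt
  = \frac{3}{2\lambda} - \frac{2}{3\lambda}
  = \frac{5}{6\lambda}
\end{align*}
```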
A graph is plotted in figure(10) of the reliability of a simplex and a
TMR system against $\lambda t$, the normalised mission time. It is seen that there
is a crossover at which point the TMR system ceases to be more reliable
than the simplex system. The position of this point may be calculated by
equating the simplex and TMR reliabilities according to:

$$R = 3R^2 - 2R^3$$

The solution $R = 0.5$ corresponds to $\lambda t = 0.693$. The crossover point, which
occurs before $\lambda t = 1$, explains why the MTTF of a TMR system is worse than
that of a simplex system. For a system without repair, the useful life of
a TMR system is limited to $\lambda t = 0.693$ or less.
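The crossover can be confirmed numerically with the sketch below, which evaluates both reliabilities against the normalised mission time. The function names and the two sample points either side of the crossover are chosen only for illustration.

```python
import math


def simplex_reliability(normalised_time):
    """Reliability of a single channel; normalised_time = lambda * t."""
    return math.exp(-normalised_time)


def tmr_reliability(normalised_time):
    """Reliability of a TMR system with perfect voters."""
    r = simplex_reliability(normalised_time)
    return 3.0 * r**2 - 2.0 * r**3


crossover = math.log(2.0)  # R = 0.5 gives lambda * t = ln 2 = 0.693

for x in (0.3, crossover, 1.0):
    print(round(x, 3),
          round(simplex_reliability(x), 4),
          round(tmr_reliability(x), 4))
```

Before the crossover the TMR system is the more reliable of the two, and after it the simplex system is.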
4.3 EFFECT OF MAINTENANCE ON A SYSTEM 
The previous sections have not considered repair and a system was
considered to have exceeded its useful life once it had failed. This is
typical of space and defence equipment. Systems which are repaired when
they have failed will now be considered; this situation can be considered
typical of industrial equipment such as the governor controller. Since
systems are repaired between failures, the term MTBF (mean time between
failures) is used to replace the term MTTF. The mean time to repair of a
failed system is given by the MTTR. The availability of a system is
defined as:

$$A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}} \tag{4.3.1}$$

The availability may be interpreted as a probability that the system is in
working order at any instant. The concept of availability is useful when
planning the maintenance of equipment. If the MTBF of a piece of equipment
is known, and it is required to achieve a certain level of availability,
then the necessary MTTR to achieve that level of availability may be
calculated.
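The calculation of the necessary MTTR follows from rearranging equation (4.3.1) and may be sketched as below. The MTBF of 50 000 hours and the target availability of 0.9999 are hypothetical values, and the function names are not taken from the text.

```python
def availability(mtbf, mttr):
    """A = MTBF / (MTBF + MTTR), equation (4.3.1)."""
    return mtbf / (mtbf + mttr)


def required_mttr(mtbf, target_availability):
    """Largest MTTR that still meets the target availability.

    Obtained by rearranging A = MTBF / (MTBF + MTTR) for MTTR.
    """
    return mtbf * (1.0 - target_availability) / target_availability


mtbf_hours = 50000.0          # hypothetical equipment MTBF
target = 0.9999               # hypothetical availability requirement
mttr_hours = required_mttr(mtbf_hours, target)
print(round(mttr_hours, 2))
```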
4.3.1 Redundant systems 
The introduction of redundancy into a system, together with repair,
allows considerable improvements to be made in equipment reliability. For
example if one channel of a TMR system fails and is repaired before a
further channel fails, then the system will operate without interruption.
In order to achieve interruption free operation, it is essential to know
when components fail so that they may be replaced before further failures
occur.
For a repairable system with n units of which one is a spare, the
Markov failure model is given below:

state 1 : n out of n
state 2 : n-1 out of n
state 3 : failed

where: $\lambda_1$ = failure rate of a single unit
$\lambda_2$ = repair rate
$c$ = coverage
The coverage is a number between zero and unity and is a measure of the
proportion of faults which are covered by the redundancy in the system,
where unity is equivalent to complete coverage. The transfer rates between
states are given by:
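$$\frac{dP_1(t)}{dt} = -n\lambda_1 P_1(t) + \lambda_2 P_2(t) \tag{4.3.2}$$

$$\frac{dP_2(t)}{dt} = n\lambda_1 c P_1(t) - \lambda_2 P_2(t) - (n-1)\lambda_1 P_2(t) \tag{4.3.3}$$

$$\frac{dP_3(t)}{dt} = n\lambda_1 (1-c) P_1(t) + (n-1)\lambda_1 P_2(t) \tag{4.3.4}$$

The state transition matrix is therefore given by:

$$\begin{bmatrix} -n\lambda_1 & n\lambda_1 c & n\lambda_1 (1-c) \\ \lambda_2 & -[\lambda_2 + (n-1)\lambda_1] & (n-1)\lambda_1 \\ 0 & 0 & 0 \end{bmatrix}$$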
The solution of these equations in terms of the MTFF (mean time to first 
failure) is given by Dhillon and Singh [71] as : 
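$$\text{MTFF} = [1 \;\; 0]\,[-Q]^{-1}\,U \qquad (4.3.5)$$
where : U = unit matrix (a column of ones)
Q is the reduced state transition matrix which is given by :
$$[Q] = \begin{bmatrix} -n\lambda_1 & n\lambda_1 c \\ \lambda_2 & -[\lambda_2 + (n-1)\lambda_1] \end{bmatrix}$$
Hence
$$[-Q]^{-1} = \frac{1}{n(n-1)\lambda_1^2 + n\lambda_1\lambda_2(1-c)} \begin{bmatrix} \lambda_2 + (n-1)\lambda_1 & n\lambda_1 c \\ \lambda_2 & n\lambda_1 \end{bmatrix}$$
The MTFF is given by :
$$\text{MTFF} = [1 \;\; 0]\,[-Q]^{-1}\,U = \frac{\lambda_2 + (n-1)\lambda_1 + n\lambda_1 c}{n(n-1)\lambda_1^2 + n\lambda_1\lambda_2(1-c)} \qquad (4.3.6)$$
For $\lambda_1 \ll \lambda_2$ this equation can be simplified to
$$\text{MTFF} = \frac{\lambda_2}{n(n-1)\lambda_1^2 \left[ 1 + \dfrac{(1-c)\,\lambda_2}{(n-1)\,\lambda_1} \right]} \qquad (4.3.7)$$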
Which is the same equation as that given by Arnold [72]. 
Equation(4.3.6) may be checked for a TMR system with no repair which
has 100 per cent coverage using the values :
$\lambda_2 = 0$
c = 1
n = 3
this gives :
$$\text{MTFF} = \frac{5}{6\lambda_1}$$
which agrees with the expression for MTFF derived previously. 
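The same check can be made numerically. The sketch below is an illustration only: it assumes that U is a column of ones, that Q covers the two working states (states 1 and 2) with each row holding the rates out of that state, and it uses exact fractions so that the matrix form of (4.3.5) can be compared directly with the closed form of (4.3.6). The function names and the numerical rates are hypothetical.

```python
from fractions import Fraction


def mtff_matrix(n, lam1, lam2, c):
    """Evaluate [1 0] [-Q]^-1 [1 1]^T for the two working states."""
    # Elements of -Q; each row holds the rates out of one state.
    a, b = n * lam1, -n * lam1 * c
    d, e = -lam2, lam2 + (n - 1) * lam1
    det = a * e - b * d
    # The first row of the inverse of [[a, b], [d, e]] is [e, -b] / det.
    return (e - b) / det


def mtff_closed_form(n, lam1, lam2, c):
    """Closed form of equation (4.3.6)."""
    numerator = lam2 + (n - 1) * lam1 + n * lam1 * c
    denominator = n * (n - 1) * lam1**2 + n * lam1 * lam2 * (1 - c)
    return numerator / denominator


lam = Fraction(1, 1000)  # illustrative unit failure rate per hour (assumed)
tmr_no_repair = mtff_matrix(3, lam, Fraction(0), Fraction(1))
print(tmr_no_repair, Fraction(5, 6) / lam)
```

With no repair and complete coverage, both printed values equal 5/(6 x failure rate), which is the TMR result quoted above.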
The concept of coverage as applied to software is easy to define. It
is the proportion of faults that can be detected and recovered from. For
hardware it is proposed that the coverage is defined by :
$$c = \frac{\text{failure rate of protected circuitry}}{\text{failure rate of protected + unprotected circuitry}}$$
The unprotected circuitry includes all common circuitry which contributes
to common mode failures. 
4.4 METHODS OF EXPRESSING IMPROVEMENT 
Any improvement obtained in the reliability of a system can be shown 
by an improvement in the availability or the reliability of the system for 
a given mission time. In an industrial environment it is useful to 
consider any improvement that can be made to the MTTF of a system. The 
MTTFIF (mean time to failure improvement factor) is defined as :
$$\text{MTTFIF} = \frac{\text{MTTF (fault tolerant)}}{\text{MTTF (non fault tolerant)}}$$
Hence the MTTFIF for a system with n units of which one is a spare is given 
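by the modified equation(4.3.7) :
$$\text{MTTFIF} = \frac{\lambda_2 / \lambda_1}{n(n-1) \left[ 1 + \dfrac{(1-c)\,\lambda_2}{(n-1)\,\lambda_1} \right]} \qquad (4.4.1)$$
Hence the MTTFIF for a TMR system with repair as shown by Pearson and
Preece [73] is given by :
$$\text{MTTFIF} = \frac{\lambda_2 / \lambda_1}{6 \left[ 1 + \dfrac{k}{6 + 2k}\,\dfrac{\lambda_2}{\lambda_1} \right]} \qquad (4.4.2)$$
where : $\lambda_2$ = repair rate
$\lambda_1$ = failure rate of a single unit
$k = \lambda_v / \lambda_1$
$\lambda_v$ = failure rate of voters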
A graph of MTTFIF against $\lambda_2/\lambda_1$ for different k factors is shown in
figure(11). For large values of k, it is seen that little improvement can
be made in the MTTF as the ratio $\lambda_2/\lambda_1$ is increased. For small values of
k, large improvements can be made to the MTTF as the ratio $\lambda_2/\lambda_1$ is
increased. For a given value of k it is seen that the curve reaches a
plateau, after which further increases in $\lambda_2/\lambda_1$ cause little extra
improvement in MTTF. The point at which this plateau is reached depends on 
k and this point should be used to determine the minimum time to repair. 
Any further improvements in the repair time will be wasted, since the MTTF 
cannot be improved much more. 
These findings can be summarised by stating that in order to achieve 
improvement in the MTTF of a system by repair, the fault coverage must be
high. 
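The plateau can be illustrated numerically. The sketch below reads equation (4.4.2) as MTTFIF = (ratio) / (6 [1 + k x ratio / (6 + 2k)]), where ratio is the repair rate divided by the unit failure rate and k is the voter failure rate divided by the unit failure rate. The limiting value (6 + 2k) / 6k follows from letting the ratio grow without bound. The values of k and of the ratio are arbitrary illustrations.

```python
def mttfif_tmr(ratio, k):
    """MTTFIF of a TMR system with repair, equation (4.4.2)."""
    return ratio / (6.0 * (1.0 + k * ratio / (6.0 + 2.0 * k)))


def plateau(k):
    """Limit of (4.4.2) as the repair/failure rate ratio becomes very large."""
    return (6.0 + 2.0 * k) / (6.0 * k)


for k in (0.01, 0.1, 1.0):
    values = [round(mttfif_tmr(ratio, k), 1) for ratio in (10, 100, 1000, 10000)]
    print(f"k = {k}: MTTFIF = {values}, plateau = {plateau(k):.1f}")
```

For k = 1 the plateau is only 4/3, so no amount of repair effort helps, whereas for k = 0.01 the plateau is above 100.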
A similar approach can be used to determine the improvement to be 
gained by using SEC/DED Hamming code protected memory. In this case there 
are n RAM chips of which one is a spare. The coverage is calculated as 
previously defined : 
$$c = \frac{\text{failure rate of RAM chips}}{\text{failure rate of RAM chips + Hamming code circuitry}}$$
4.5 SUMMARY 
Two types of reliability analysis have been discussed. The first is
concerned with predicting the probability of failure free operation of a 
non-repairable system for a given mission time. This analysis is typically 
used in space and defence applications. The second type of analysis is 
directly applicable to the governor controller. The individual failure 
rates are added together to give the total failure rate. The effect of 
maintenance and repair is then considered and the MTTF of the equipment is 
calculated. In order that a repairable redundant controller may be
compared with a non-redundant system without repair, the concept of MTTF 
improvement factor is introduced. Consideration of the MTTF improvement 
factor allows the optimum repair time to be calculated. 
CHAPTER 5 
TECHNIQUES FOR IMPROVING A SYSTEM'S FAULT-TOLERANCE 
5.1 LEVELS OF FAULT-TOLERANCE 
A structured approach to fault-tolerance implementation is proposed by 
Halse et al [74]. Fault-tolerant techniques are assigned levels of 
recovery as shown below. 
level 3 : Emergency fail-safe or shut down
level 2 : Exception handling and recovery routines
level 1 : Hardware fault-tolerance
At the lowest level, level 1, complete recovery is provided from a 
limited range of fault types. Recovery is embedded in the hardware design 
of the system and in the experimental controller consists of recovery from 
voting and memory errors. Level 2 provides a more limited recovery from a 
wider range of faults. Any state of the controller which is outside its 
specification is termed an "exception". As the processing of data 
proceeds, exception handling routines check for errors which may be caused 
by external stimuli or design faults, and initiate recovery from them. 
Recovery routines perform the logging of errors and identical fault 
recovery may be used to recover from different types of fault. At the 
highest level, level 3, limited or degraded recovery, such as a fail-safe
emergency shut-down, is provided from a wide variety of fault types. 
A structured approach to assigning levels of fault recovery is useful, 
since if a given level fails to recover from a fault, recovery is passed up 
to the next level until recovery or soft failure is successfully completed. 
5.2 DESIGN TOOLS 
Two commonly used design tools for assessing the reliability of a 
piece of equipment are FMECA and FTA which are defined below. 
5.2.1 FMECA 
Fault mode, effect and criticality analysis is a formal design 
exercise where each component in a system is considered and a failure rate 
is assigned to each mode of failure. For instance a resistor might fail 
open circuit 25 per cent of the time and fail out of tolerance for 75 per 
cent. The consequence of resistor failure is listed, together with a 
criticality rating. Failure open circuit might result in failure of the 
circuit, whilst parameter drift might result in a degraded but acceptable 
performance. An example of a FMECA is given below : 
Item           Failure mode      Failure rate   Effect                        Criticality rating
                                 (f/M hours)                                  (1=low, 3=high)
resistor R1    open circuit      0.002          no output from circuit        3
resistor R1    parameter drift   0.006          reduced output                1
capacitor C1   short cct.        0.002          no output                     3
capacitor C1   open circuit      0.002          reduced gain at high freqs.   2
Hence the failure rate at each criticality rating may be determined, rather 
than assuming all failures are catastrophic. This analysis is well suited 
to analogue circuits and data sources such as NCSR [34] give the percentage 
failure for each mode. The criticality analysis is of little use for 
systems containing many digital integrated circuits. Failure rate data 
gives no information about failure modes and it is assumed that any failure 
within complex integrated circuits results in total failure of the device. 
In this case, the FMECA is shortened to FMEA, fault mode effect analysis. 
If the failure rates of all the components in the system are tabulated, 
then components having a high relative failure rate can be identified and 
corrective action taken. An example of a FMEA is given in section 7.3 when 
the microprocessor was found to dominate the total failure rate. 
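The summation of failure rates at each criticality rating can be sketched as follows, using the four rows of the FMECA example above. The data structure and function name are assumptions made for illustration; they do not represent any tool used in the design of the governor controller.

```python
from collections import defaultdict

# (item, failure mode, failure rate in f/M hours, criticality rating)
FMECA_ROWS = [
    ("resistor R1", "open circuit", 0.002, 3),
    ("resistor R1", "parameter drift", 0.006, 1),
    ("capacitor C1", "short circuit", 0.002, 3),
    ("capacitor C1", "open circuit", 0.002, 2),
]


def rate_by_criticality(rows):
    """Sum the failure rates of all failure modes sharing a criticality rating."""
    totals = defaultdict(float)
    for _item, _mode, rate, rating in rows:
        totals[rating] += rate
    return dict(totals)


totals = rate_by_criticality(FMECA_ROWS)
for rating in sorted(totals, reverse=True):
    print(f"criticality {rating}: {totals[rating]:.3f} f/M hours")
```

Only a third of the total failure rate of these two components falls in the highest criticality rating, which is the information lost if every failure is assumed to be catastrophic.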
5.2.2 FTA 
Fault tree analysis is complementary to FMECA and FMEA. Instead of
being a bottom-up process it is a top-down process. The analysis starts by 
considering failure modes of the complete system and then working downwards 
to identify the part failures which could generate such system failures. 
Conventional logic symbols are used to combine events. An example of a FTA 
for a pressure trip is shown in figure(12). If the pressure exceeds the
preset value it is required to trip the process. To prevent spurious 
trips, two pressure transducers are used, both of which must signal 
overpressure before action is taken. If probabilities are assigned to each
fault, then the probability of an erroneous trip may be calculated. 
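A minimal sketch of such a calculation is given below. It assumes that the fault events are statistically independent, and the probability assigned to a false overpressure signal from one transducer is purely hypothetical. It does not reproduce the fault tree of the figure, only the effect of combining two transducers through an AND gate.

```python
def and_gate(*probabilities):
    """Probability that all input events occur (independence assumed)."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate(*probabilities):
    """Probability that at least one input event occurs (independence assumed)."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


# Hypothetical probability that one transducer falsely signals overpressure.
p_false = 0.01
single_transducer = or_gate(p_false)
two_transducers = and_gate(p_false, p_false)
print(f"spurious trip, one transducer : {single_transducer:.4f}")
print(f"spurious trip, two transducers: {two_transducers:.6f}")
```

The same AND gate also means that both transducers must work for a genuine overpressure to be acted upon, so the probability of a missed trip would be found with the OR gate applied to the transducer failure probabilities.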
5.3 HARDWARE 
When considering the failure rate of hardware it is useful to remember
that using less hardware will decrease the failure rate, since there are
fewer components to fail. A well managed and designed piece of equipment 
is likely to be more reliable than a hurried design where the main 
acceptance test is whether the equipment works or not. 
Several methods are now proposed for improving the fault-tolerance of 
digital controllers. Many of these methods are used on the governor
controller described in chapter eight.
5.3.1 Watchdog timer 
A watchdog timer is probably one of the simplest and most effective 
devices that can be added to a system. It will not cure all ills, but 
should prevent the system remaining in a crashed state. In its simple 
form, the watchdog consists of a monostable which is triggered under 
software control by the system it is protecting, at time intervals less
than the pulse width of the monostable. In this way the monostable
generates a constant pulse. If the retriggering pulse is not received,
then after a delay set by the monostable, it times out and a reset pulse is
generated for the system. It is important that the watchdog is retriggered
once only per program cycle, and this is usually performed on each pass 
through the main control loop. There is a small probability that the 
controller can stick in an erroneous loop, continually retriggering the 
watchdog, and the watchdog will fail to reset the system. To protect 
against this more complex watchdogs can be used. As the controller passes 
in an orderly sequence through a set of instructions, a signature or key is
output to the watchdog which is only retriggered if it receives this key in 
the correct sequence. 
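The behaviour of such a keyed watchdog can be sketched as a small software model. The class below is an illustration only: the key values, the timeout expressed in ticks, and the class itself are assumptions, and a real watchdog would be implemented in hardware outside the processor it protects.

```python
class KeyedWatchdog:
    """Model of a watchdog that retriggers only on the correct key sequence."""

    def __init__(self, keys, timeout_ticks):
        self.keys = list(keys)
        self.timeout_ticks = timeout_ticks
        self.position = 0
        self.remaining = timeout_ticks

    def write_key(self, key):
        if key == self.keys[self.position]:
            self.position += 1
            if self.position == len(self.keys):
                self.position = 0
                self.remaining = self.timeout_ticks  # full sequence seen: retrigger
        else:
            self.position = 0  # key out of sequence: start again, no retrigger

    def tick(self):
        """Advance time by one tick; return True if a reset pulse is generated."""
        self.remaining -= 1
        if self.remaining <= 0:
            self.remaining = self.timeout_ticks
            self.position = 0
            return True
        return False


watchdog = KeyedWatchdog(keys=(0xA5, 0x5A), timeout_ticks=3)

healthy = []
for _ in range(10):  # normal program cycle outputs both keys in order
    watchdog.write_key(0xA5)
    watchdog.write_key(0x5A)
    healthy.append(watchdog.tick())

stuck = []
for _ in range(3):  # erroneous loop repeatedly outputs the first key only
    watchdog.write_key(0xA5)
    stuck.append(watchdog.tick())

print(any(healthy), any(stuck))
```

A simple monostable would have been retriggered by the erroneous loop, whereas the keyed version times out and resets the system.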
5.3.2 Snake 
A snake is a long sequence of NOPs or restart instructions which 
occupy any unused address space. An erroneous jump into this address space 
causes the program to "slide" down the snake until vectored recovery is 
executed at the end of the snake. Unused address space is connected so 
that an instruction fetch from non-existent memory fetches a no-operation 
or software restart instruction. A software restart instruction is used to 
jump back into the program. If no-operation instructions are used, then 
the controller will execute a whole series of NOPs until the program 
counter arrives at a valid address range. The 8085 microprocessor ideally 
lends itself to this technique as described in chapter eight. If non-
existent memory is pulled high, an instruction fetch will read FF and a
RST7 will be executed. If the microprocessor does not support software 
restarts and if full memory decoding is used, memory selects corresponding 
to non-existent memory can be used to force the instruction for a NOP or a
restart onto the data bus. 
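The effect of the snake on the 8085 can be shown with a toy model of the instruction fetch. The model below assumes a hypothetical memory map with a 2K ROM at address zero; it relies only on the facts that the 8085 opcode FF is RST 7 and that RST 7 vectors to address 0038 hexadecimal. It interprets no other instruction and is not the circuitry of chapter eight.

```python
RST7_OPCODE = 0xFF    # 8085 opcode for RST 7
RST7_VECTOR = 0x0038  # address to which RST 7 transfers control


def make_fetch(rom, rom_base=0x0000):
    """Return a fetch function in which unmapped addresses read as FF."""
    def fetch(address):
        offset = address - rom_base
        if 0 <= offset < len(rom):
            return rom[offset]
        return RST7_OPCODE  # non-existent memory: data bus pulled high
    return fetch


def next_program_counter(fetch, pc):
    """Toy model: only RST 7 is interpreted, any other opcode just advances."""
    opcode = fetch(pc)
    if opcode == RST7_OPCODE:
        return RST7_VECTOR
    return (pc + 1) & 0xFFFF


rom = bytes(2048)  # hypothetical 2K ROM filled with NOPs (opcode 00)
fetch = make_fetch(rom)
print(hex(next_program_counter(fetch, 0x9000)))  # erroneous jump into unused space
```

A recovery routine placed at the RST 7 vector is therefore entered after a single fetch from any unused address.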
5.3.3 Power supply levels 
Many logic circuits are only guaranteed to function over a ±5 per
cent power supply variation. Power supplies should be set to the middle of 
this range to give maximum protection against noise. It is important to 
detect transient undervoltage of the power supplies and to reset the 
system, since transient undervoltages have been observed to crash
microprocessors. A dip in the +5V supply to a microprocessor by only 1V
was found to crash it and such a dip was too fast to be detected by the
power-on-reset circuitry.
5.3.4 Output verification 
After an output function has been performed, the output is read back 
and is compared with the value which has just been output. In this way it
is possible to test for failure and transient faults in the output 
circuitry. 
5.3.5 Component redundancy 
Critical circuit components can be duplicated with a spare. This has 
both cost and reliability advantages over a TMR system, but suffers from 
the severe drawback that with only two components it is difficult to 
determine which is in error. A TMR configuration overcomes this difficulty 
by taking a majority vote. Chapter four has shown that with maintenance, 
large improvements can be made in the reliability of equipment using a TMR 
configuration, however care should be taken not to exceed the "useful life" 
of a TMR system without repair. Voting can be performed in software or by 
the logic circuit of figure(5). 
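When voting is performed in software, the usual two-out-of-three majority function can be applied bit by bit to the three channel values. The sketch below shows this function together with a simple check of which channel disagrees. It is a generic illustration with assumed names and is not a transcription of the circuit of figure(5).

```python
def majority_vote(a, b, c):
    """Bitwise two-out-of-three majority of three channel values."""
    return (a & b) | (b & c) | (a & c)


def disagreeing_channels(a, b, c):
    """Return the names of the channels whose value differs from the vote."""
    voted = majority_vote(a, b, c)
    channels = (("A", a), ("B", b), ("C", c))
    return [name for name, value in channels if value != voted]


# Channel C has a single bit in error.
print(hex(majority_vote(0x3C, 0x3C, 0x7C)))
print(disagreeing_channels(0x3C, 0x3C, 0x7C))
```

Reporting the disagreeing channel is what allows a failed channel to be repaired before a second failure occurs.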
5.3.6 Memory protection 
The addition of an extra parity bit to memory allows errors to be 
detected as long as an odd number of bits is in error, but it is not 
possible to correct errors. A better type of protection is the SEC/DED
Hamming code which will correct single bit errors and detect double bit 
errors. The disadvantage with this code is that the memory word length is 
increased since check bits must be stored as well as data bits. 
For small systems the coding can be performed in ROM as described in
chapter eight, but for larger word lengths there are several VLSI circuits
available which implement SEC/DED Hamming code on 16 bit word lengths,
references [59,79,80]. For small systems, the main benefit is to be gained
from the correction of transient errors, however large systems benefit in
addition by the correction of permanent errors.
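The principle of SEC/DED coding can be shown on a four bit data word. The sketch below uses the extended Hamming (8,4) code, which is chosen here only because it is the smallest example; the word length, bit ordering and function names are assumptions and do not describe the ROM coding of chapter eight or any particular VLSI device.

```python
def encode(nibble):
    """Encode 4 data bits as 8 bits; index 0 holds the overall parity bit."""
    d = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]  # check bit for positions 1, 3, 5, 7
    word[2] = word[3] ^ word[6] ^ word[7]  # check bit for positions 2, 3, 6, 7
    word[4] = word[5] ^ word[6] ^ word[7]  # check bit for positions 4, 5, 6, 7
    word[0] = sum(word[1:]) & 1            # makes the total parity even
    return word


def decode(word):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double'."""
    word = list(word)
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position
    overall = sum(word) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        word[syndrome] ^= 1  # syndrome 0 here means the overall parity bit
        status = "corrected"
    else:
        return None, "double"  # detected but not correctable
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return nibble, status


stored = encode(0b1011)
stored[6] ^= 1  # single bit error
print(decode(stored))
stored[2] ^= 1  # second bit error in the same word
print(decode(stored))
```

Four check bits are needed to protect four data bits in this example, which illustrates the increase in memory word length noted above; the overhead is proportionally smaller for longer words.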
5.3.7 Temperature 
As shown in chapter three, the failure rate of semiconductors 
increases exponentially with temperature. It is therefore essential to
keep the system temperature as low as possible and to use forced cooling if
necessary.
5.4 SOFTWARE 
Structured programming is essential in order to achieve reliable
software. The software is divided into easily manageable modules which are
written and debugged separately and then linked together to form the 
complete software package. For small systems, having a program size of a 
few kilobytes it is probably better to write programs in assembler. If 
assembler is used, each machine instruction is defined by the programmer 
and is under better control than code produced by a high level language.
Because of the complex nature of recovery routines and the difficulty in
separating hardware and software in a small controller, it is better to 
write recovery routines in assembler.
As more complex microprocessors such as 16 bit devices are used and 
the program size is increased above a few kilobytes, it becomes 
increasingly difficult to write in assembler. There is a trade off between 
writing in a high level language which is easier to document and modify at 
a later date probably by another programmer, and writing in assembler which 
is more efficient and where the code produced is under tighter control. It 
is suggested that for large software packages, only the recovery routines 
are written in assembler and the rest is written in a high level language 
such as Pascal which offers structured programming. The machine code
produced by the high level language should be examined to ensure that it 
meets the high reliability requirements of the controller. 
5.4.1 Exception handling 
Any abnormal response in a controller is termed an "exception" which
may be caused by transient or design errors in either hardware or software. 
Exception handling is a powerful technique, implemented in software, where 
the software continually checks itself for errors and initiates recovery if 
any are detected. A simple form of exception handling would be to test the 
input values read into a controller and reject them if they are outside a 
reasonable range. When exceptions are detected, techniques such as "roll-
back" are useful in executing recovery. Roll-back is a form of time 
redundancy which repeats the block of code in which an exception has been 
detected. In this way it is not necessary to restart the controller and
only a small block of code need be repeated. The recovery blocks described
in chapter eight are a crude type of roll back and are illustrated in
figure(13). The program is divided up into a number of routines. Each
routine inputs a few values, performs some calculation or control action, 
and then continues onto the next routine. Ideally routines pass few 
variables between each other which should be contained in microprocessor
registers, since the recovery blocks of chapter eight only restore register
status following a crash, and make no attempt to restore RAM contents,
which could be included.
As shown at the end of each routine, a recovery block is generated.
The system is shown to crash after routine 4, so a recovery block is
executed which causes routine 4 to be repeated before passing onto routine
5. 
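The roll-back idea can be sketched in a high level language as follows. This is an illustration under stated assumptions and not the assembler recovery blocks of chapter eight: the "registers" are modelled by a dictionary, the exception class, the routine names, the retry limit and the pressure limits are all hypothetical, and a transient fault is simulated by a bad reading that occurs once only.

```python
class ControllerException(Exception):
    """Raised when a routine detects a state outside its specification."""


def run_with_rollback(routines, registers, max_retries=3):
    """Run routines in turn; on an exception restore the registers and repeat."""
    log = []
    for index, routine in enumerate(routines, start=1):
        recovery_block = dict(registers)  # register status saved before the routine
        for _attempt in range(max_retries + 1):
            try:
                routine(registers)
                break
            except ControllerException as error:
                log.append((index, str(error)))
                registers.clear()
                registers.update(recovery_block)  # restore registers only, not RAM
        else:
            raise RuntimeError(f"routine {index} failed, pass recovery to next level")
    return log


transient = {"pending": True}  # simulates a single transient fault


def read_pressure(registers):
    registers["pressure"] = 9999 if transient["pending"] else 42
    if not 0 <= registers["pressure"] <= 100:  # reasonableness test on the input
        transient["pending"] = False
        raise ControllerException("pressure out of range")


def compute_output(registers):
    registers["output"] = registers["pressure"] * 2


registers = {}
fault_log = run_with_rollback([read_pressure, compute_output], registers)
print(fault_log, registers)
```

If a routine keeps failing after the permitted number of repeats, the fault is treated as permanent and recovery is passed up to the next level, in keeping with the levels of section 5.1.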
5.5 SELF-TESTING
Self-testing or health monitoring of controllers can be performed in 
spare execution time and is carried out as a background task which should
not interrupt control of the process. Chapter four has shown that large
improvements can be made in the reliability of equipment if the MTTR is 
reduced. In order to repair equipment, it is necessary to know that it has 
failed which means that self-testing is essential in redundant systems 
where the redundancy will often mask component failures. 
Self-testing can be divided into two groups :
(i) Diagnostic - When an exception is detected, the cause should be 
determined and transient faults distinguished from permanent 
faults which require to be repaired. Fault logging may point 
towards a suspect component or area of bad design if repeated 
transient faults are logged. 
(ii) Preventative - At regular intervals the system modules are 
tested. Such tests are a checksum test of ROM, test of RAM,
input/output circuitry etc. In the particular case of gas
regulator valves where the detailed response characteristics of
the system are known, it might be possible to predict mechanical
valve failures, such as a sticking valve, by injecting a
disturbance into the system and analysing the response. 
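Two of the preventative tests listed above can be sketched as follows. The checksum shown is a simple eight bit additive sum with a check byte chosen to make the total zero, and the RAM test writes and reads back two complementary patterns; both are assumed forms chosen for illustration, and the text does not state which algorithms are used on the governor controller.

```python
def checksum(image):
    """Eight bit additive checksum of a ROM image."""
    return sum(image) & 0xFF


def with_check_byte(program):
    """Append a byte so that the whole image sums to zero modulo 256."""
    return bytes(program) + bytes([(-sum(program)) & 0xFF])


def rom_test_passes(image):
    return checksum(image) == 0


def ram_test_passes(ram, patterns=(0x55, 0xAA)):
    """Write and read back each pattern, then restore the original contents."""
    for address in range(len(ram)):
        saved = ram[address]
        for pattern in patterns:
            ram[address] = pattern
            if ram[address] != pattern:
                return False
        ram[address] = saved
    return True


rom = with_check_byte(b"\x3e\x01\xd3\x10\x76")  # arbitrary example bytes
corrupted = bytearray(rom)
corrupted[2] ^= 0x04  # single bit fault in the ROM
print(rom_test_passes(rom), rom_test_passes(corrupted))
print(ram_test_passes(bytearray(16)))
```

Because the RAM test restores each location after testing it, it is suited to running as a background task, provided that interrupts cannot use a location while it is under test.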
To a limited extent a microprocessor can test itself. Firstly a 
kernel of essential instructions is verified which can then be used to
test other instructions. This technique is described by Hunger [75] with 
particular reference to the 8085. The self-testing sequence is most 
efficiently designed with reference to the manufacturer's gate model of the
microprocessor and a coverage of 60% is quoted for d.c. chip faults.
The need for board testing equipment can be eliminated if boards are
designed to test themselves. Daniels and Fasang [76] describe an 8085
microprocessor board that will test itself. The self-test feature can be 
used as part of the manufacturing process as well as being useful in 
installed equipment for the self-diagnosis of faults.
If an error is detected during self-testing, an attempt can be made to
reallocate components such as RAM and control of the process could 
continue. Such a technique is called "graceful degradation". If the fault 
is too serious to continue, the process can be made to fail safe. The
provision of telemetry to the controller, used for fault reporting, can
achieve very short MTTRs. If no telemetry is available, faults must be
logged as they occur, and can then be dealt with as part of routine
maintenance.
Faults in redundant systems are often easier to diagnose, because
failure in a non-redundant system will just cause it to cease operation.
If a redundant system can continue to operate in the presence of a fault, 
it is possible to give a detailed diagnosis of the fault which will reduce 
the repair time of equipment and skill required. 
5.6 SUMMARY 
In a small controller such as the governor controller, it is difficult 
to separate the hardware from the software. A structured approach to the 
implementation of fault-tolerance by assigning fault recovery levels 
ensures that most classes of fault will be detected and recovery executed.
A valuable tool for identifying the most unreliable parts of a system 
is FMECA which was used on the governor controller when deciding where it 
was necessary to incorporate fault-tolerance. The complementary FTA is 
discussed for the sake of completeness, and was not used when designing the
governor controller because FMECA was found to be more suitable. 
Hardware techniques such as a watchdog timer, snake, power supply 
undervoltage detection, output verification, memory protection, and
component redundancy are all used on the governor controller. 
Exception handling routines are initiated by hardware on the governor 
controller and software fault-tolerance is used in selecting the best 
pressure information from the redundant transducers. Much of the governor 
controller will test itself, and the nature of all transient and permanent 
faults is logged. At regular intervals a self-test is performed on the 
solenoid valves and the self-test software could be expanded to test the 
complete system. 
CHAPTER 6 
EFFECT OF SYSTEM ARCHITECTURE ON RELIABILITY AND COST
6.1 CHOICE OF MICROPROCESSOR
The first decision to be made when choosing a microprocessor is the 
processing power required, that is whether a single chip, 8 bit, or 16 bit 
device is required. 
6.1.1 Single chip microprocessors 
If the process consists of a single control loop and little arithmetic
is involved, it is likely that a single chip device will have sufficient 
processing power. Most single chip microprocessors contain internal read 
only memory (ROM) and random access memory (RAM) which makes the sharing of
memory, Hamming code protection, and the provision of a spare copy of ROM 
impossible unless the single chip system is expanded. If a FMEA of the 
controller shows that the failure rate is too high, it is necessary to
incorporate some form of redundancy according to the guidelines of section 
6.3. Failure of the controller could be detected and a standby switched 
in. Whilst this is satisfactory for permanent faults, it offers little
protection against transients, since with only two controllers it is
impossible to arrive at a majority decision should the controllers differ.
It is for this reason that the TMR configuration is recommended. Voting in 
a single chip TMR system could be either in hardware or by software. 
If the voting were performed in software, a very reliable controller 
consisting of only a few components could be constructed. 
6.1.2 Eight bit microprocessors 
The design of the controller reported in chapter eight is an example 
of a redundant 8 bit microprocessor controller where most of the redundancy 
is implemented in hardware. Although the hardware techniques described 
could be used with most 8 bit microprocessors such as the Z80, MCM6800 
etc., the 8085 hardware is ideal for a redundant controller. The provision
of serial input/output pins removes the need for three separate UARTS and 
the provision of four maskable and one non-maskable interrupts removes the 
need for an interrupt controller. Several interrupts are required because 
hardware exception handling routines are called by interrupts. The 6800 
microprocessor is less suitable because it only has one maskable and one 
non-maskable interrupt, and has no serial input/output pins. The 
instructions for NOP and SWI are not ideal for implementing a "snake". 
Possibly a better microprocessor than the 8085 would be the NSC800 
which is identical in hardware to the 8085 except that no serial 
input/output pins are provided. The instruction set is compatible with the
8085 and consists of the more powerful Z80 instruction set.
If much of the redundancy were to be performed in software at the
expense of processor throughput, it is possible that different
microprocessors would be more suitable. 
6.1.3 Sixteen bit microprocessors 
Modern 16 bit microprocessors are really outside the definition of 
"small" controllers and are almost as powerful as mini-computers. It is 
probably better to apply more of the redundancy in software because of the 
processing power and increased hardware and software complexity. 
The Texas 9900 microprocessor differs from other 16 bit devices 
because it does not perform calculations on internal registers, but on a
block of registers held in RAM which are pointed to by the internal
"workspace pointer". Thus if the workspace pointer is corrupted, all
registers will be addressed incorrectly and corruption or permanent failure
of RAM will cause register errors.
Voting at bus level with 16 bit microprocessors is not to be 
recommended because of the complexity of the hardware and the number of 
signals. Instead it would be better to arrange the three microprocessors
as a master and two slaves sharing common memory. Communication between 
the processors would then be by the common RAM and the voting would be 
mainly performed in software. 
It is likely that 16 bit microprocessors will have a lot of RAM which 
will probably be shared as well. The RAM should be protected and it is
suggested that 16 bit SEC/DED Hamming code integrated circuits could be
used to protect the RAM, references [59,79,80], instead of the circuitry
proposed in chapter eight. 
Large software packages will be required for a 16 bit controller which 
are better written in a high level language. Assembler should only be used 
as necessary and for critical areas such as voting and recovery routines as
discussed in chapter five.
In high reliability applications, 16 bit microprocessors should only 
be used if an 8 bit device is not powerful enough, or if the addition of a 
maths processor to an 8 bit system is less reliable than a 16 bit system. 
The increased complexity of 16 bit systems over 8 bit systems leads to 
higher failure rates.
6.2 MAJORITY VOTING 
Majority voting in a TMR system can either be performed in hardware or 
software, the choice of which is influenced by many factors. 
6.2.1 Hardware 
For a small controller using single chip microprocessors and which 
only has a few outputs, it is easier to vote on the outputs in hardware. 
Discrete logic can be used, connected as shown in figure(5) or several
channels of voting and error detection could be contained in a FPLA as
described in chapter eight. If the failure rate of the system is
approximated to that of the voter, then a low failure rate is predicted,
typically 0.02 f/yr as compared with 0.5 f/yr for a single microprocessor,
according to MIL-217D. 
Voting on inputs is probably better performed in software, although if 
the total software package is small, the fault coverage will be reduced if
the extra fault-tolerant software becomes too large and the system 
reliability will be reduced. 
6.2.2 Software 
Three systems can be connected together in a ring, whether they are
single chip or 16 bit microprocessor based, as shown in figure(6). Each
microprocessor can communicate with its two neighbours. If a processor 
gets no response from one neighbour, but can communicate with its other 
neighbour, then the first neighbour must be in error. At any time one 
microprocessor must be in charge and act as the master and the other two 
microprocessors act as slaves. The master is however checked by the slaves
so that if the slaves are not interrogated by the master within a set time, 
it is concluded that the master has failed and one of the slaves takes over 
as master. Communication between processors could be either by shared RAM 
which is quicker and is recommended for 16 bit devices, or could be by a 
parallel or serial link. Since single chip devices contain their own RAM,
it would be better to communicate via a parallel link. 
Software communication and voting routines should not use up too much 
of the processing power of the devices or fault coverage and system 
throughput will be decreased. Software voting is prone to transient
errors, but once debugged cannot fail permanently as is the case for
hardware voters. 
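The two rules described above, diagnosis of a silent neighbour and take-over by a slave when the master stops interrogating it, can be sketched as follows. The class, the function and the timeout values are assumptions for illustration. In particular the two slaves are given different timeouts so that only one of them takes over, which is one possible arrangement and is not stated in the text.

```python
def diagnose(responds_left, responds_right):
    """Ring rule: a silent neighbour is in error if the other one answers."""
    if responds_left and responds_right:
        return []
    if responds_left:
        return ["right"]
    if responds_right:
        return ["left"]
    return ["unresolved"]  # both silent: cannot be decided locally


class Slave:
    """Becomes master if not interrogated within its timeout."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.silent_ticks = 0
        self.is_master = False

    def interrogated(self):
        self.silent_ticks = 0

    def tick(self):
        self.silent_ticks += 1
        if self.silent_ticks >= self.timeout_ticks:
            self.is_master = True
        return self.is_master


first, second = Slave(timeout_ticks=3), Slave(timeout_ticks=5)
for _ in range(3):  # the master has failed and interrogates nobody
    first.tick()
    second.tick()
second.interrogated()  # the new master now interrogates the remaining slave
print(first.is_master, second.is_master, diagnose(True, False))
```

The processing time taken by such routines reduces the time available for control, which is the trade-off against throughput noted above.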
6.3 THE USE OF REDUNDANCY AND INCREASED COST
If a FMEA of a system shows that the failure rate of some components 
is unacceptably high, some form of redundancy must be included to mask the 
high failure rate of the individual components. To a first approximation,
for a set of components which is completely protected by fault-tolerant 
circuitry, the failure rate of the protection circuitry must not exceed 
that of the non-redundant components that are to be protected. For example 
the failure rate of the voters in a TMR configuration should not exceed
that of a single channel of the triplex arrangement. 
A more accurate method of predicting the benefit likely to be gained 
from redundancy is the analysis of chapter four which includes the effect 
of maintenance. The software coverage is calculated according to the 
relative execution times of the different software routines, whilst the 
hardware coverage is calculated according to the ratio of protected to 
unprotected circuitry. The proposed repair or maintenance rate must be 
defined, then the MTTFIF can be calculated for the redundant configuration. 
If the MTTFIF is greater than unity, then the use of redundancy is 
justified. In cost sensitive applications, redundancy should be applied to
those parts of the system for which the greatest MTTFIFs are obtained.
The increased cost of redundancy should be considered in relation to 
the total system cost. For a small controller it is likely that the 
majority of the cost is due to the cabinet, power supply, printed circuit 
board and assembly costs. If the equipment is to be used in a hazardous 
environment, the cost of intrinsic safety measures such as barriers is 
substantial. It is likely that redundancy will double the component cost 
of a system and will require a larger printed circuit board and enclosure, 
but when the total system cost is examined the increase will be less than 
double. Typical microprocessors cost under £5 nowadays which makes the 
component costs small in comparison with other costs. 
A fault-tolerant system will greatly increase the design costs both of 
hardware and software. If complex voting and communication routines are 
performed in software, this will greatly increase the software design
costs. 
The design of a general purpose fault-tolerant controller which 
contains the necessary fault recovery software would allow the increased 
design costs to be spread over many units. Although a redundant controller
may initially cost more, it should last longer and require less 
maintenance. The cost of integrated circuits is approximately constant 
because of ever increasing advances in technology, but the cost of 
maintenance which is labour intensive is ever increasing. 
CHAPTER 7 
BACKGROUND TO DESIGN OF GOVERNOR CONTROLLER 
7.1 SPECIFICATION
The governor is of a hybrid nature, consisting of an electronic 
microprocessor based controller, which in turn controls a mechanical valve 
which controls the flow of gas through the governor. The two main 
constraints on the design are reliability and cost (a few hundred pounds). 
The governor should cost little more than its pneumatic equivalent and 
should be as reliable. Some increase in cost and apparent degradation of
governor reliability can be compensated for by added features and more 
efficient gas network control. The electronics should consume as little 
power as possible because of the problem of battery back-up of the mains 
supply. Whilst the electronics to be described do not prohibit the use of 
battery back-up, a CMOS design would be better from this point of view and
it should be possible to convert the design of chapter eight to CMOS at the
expense of computational speed of the controller and increased failure 
rate. Whilst design considerations may require the use of CMOS, it should 
be appreciated that the failure rate of CMOS devices is likely to be 
higher, especially under harsh environmental conditions.
The controller is a single or double loop controller, having a typical 
instruction cycle of 2µs. Depending on the software, the controller will
implement the following control algorithms :-
(i) Three term controller of outlet pressure. 
(ii) Clock control - The outlet pressure is varied over a pre-programmed
pressure versus time profile so that the governor is
able to satisfy the changes in demand throughout the day.
(iii) Flow control - The flow through the governor is maintained
constant except for high and low pressure overrides.
(iv) Demand Activated Governing (DAG) - The outlet pressure of the 
governor is varied according to the flow through the governor. 
Under conditions of low flow the outlet pressure is held at a 
minimum, whilst under conditions of high flow, the outlet 
pressure is increased to compensate for resistive pressure drops 
in the downstream pipework. 
The control algorithm implemented on the Durham governor is (iv) DAG 
since this is the most efficient and suitable for a distribution governor.
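As a rough illustration of algorithm (iv), the set-point calculation might be sketched as below. The quadratic pressure-drop model, the coefficient k and the function name are assumptions made purely for illustration; they are not taken from the Durham governor software.

```python
def dag_setpoint(flow, p_min, p_max, k):
    """Return an outlet pressure set-point for a given flow.

    Assumes (hypothetically) that the downstream pressure drop grows as
    k * flow**2. The result is limited to the range [p_min, p_max].
    """
    if flow < 0:
        raise ValueError("flow must be non-negative")
    compensated = p_min + k * flow ** 2
    return min(max(compensated, p_min), p_max)
```

At zero flow the function returns the minimum pressure, and at high flow it saturates at the maximum, which mirrors the behaviour described for DAG.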
7.2 MECHANICAL VALVE 
There are considerable problems associated with interfacing an 
electronic controller to a mechanical valve. Ideally the valve would have 
no moving parts and would control the flow of gas smoothly and linearly 
from OFF to ON by simply applying an analogue voltage ranging from 0V to a
few volts. This is obviously not possible, so some form of
electromechanical control is required.
There are two types of mechanical valve :-
(i) Direct acting - The position of the valve is controlled solely by 
an electrical signal. Failure of the electrical signal causes
the valve to fail fully OPEN or fully SHUT. 
(ii) Pneumatic back-up - A conventional pneumatic valve is modified so 
that its set-point is adjusted by an electrical signal. Failure 
of the electrical signal causes the valve to fail to its maximum or
minimum pressure and not the catastrophic OPEN or SHUT situation. 
The maximum and minimum are chosen so that the governor will 
still operate within safe limits, but under normal operating 
conditions the pressure can be varied between the two set-points, 
giving efficient control of the network. 
It was decided to opt for a mechanical valve with pneumatic back-up. 
British Gas have developed two valves of this type which the electronic
controller is capable of driving. The first type uses a stepper motor to 
vary the compression of the main valve spring and hence the set-point. 
This is shown in figure(14). The stepper motor drives a gearbox which 
engages with a screwed nylon plug. Hence the plug may be moved up and 
down, changing the compression on the main spring. This design has several 
disadvantages :-
(i) Complex drive circuitry and high power consumption of the stepper 
motor. 
(ii) The stepper motor must be contained within a large and expensive 
flame proof enclosure because of the requirement for intrinsic
safety.
(iii) If the stepper motor, or the drive signals to it, fail, then the valve
will stick at some intermediate position. 
The second type of mechanical valve used by British Gas is the one
chosen for the experimental governor at Durham. This consists of a
modified pneumatic valve controlled by two solenoid valves. By switching
the solenoids in the required sequence, it is possible to raise the outlet
pressure, hold it constant, or lower it. The valve is shown in figure(15).
The valve is a modified Donkin 226 'K' pilot which has an extended lower
body part containing a dual Bellofram sealed piston with a spring
positioned between the topside of the piston and the underside of the pilot
valve. The chamber above the upper Bellofram is connected to the pilot
outlet pressure. Gas at higher pressure from the inlet to the valve is
admitted or exhausted from the lower Bellofram chamber through two solenoid 
valves which are controlled by the microprocessor controller. The level of 
pressure in the lower chamber determines the position of the lower spring 
against the pilot valve, and therefore the amount that it deloads the pilot 
main spring. The volume between the two Belloframs is vented to atmosphere 
and serves as a breathing chamber. 
A needle valve is positioned near the inlet/outlet port of the lower 
chamber to provide a means of adjusting the rate of change of loading 
pressure. The rate of change determines both the accuracy of the set-point
and the stability of the pilot valve. The governor is arranged to fail
safe so that if the solenoids are de-energised due to a power failure, then
the outlet pressure fails to the maximum pressure controlled by the top
spring compression. This is arranged by having the inlet pressure solenoid
normally closed and the outlet pressure solenoid normally open. The
compression of the lower spring controls the minimum outlet pressure of the
pilot valve. Thus the valve will fail safe to a minimum or maximum
(normally maximum) pressure, should the solenoids fail, and not fully OPEN
or SHUT. The only dangerous failure mode is if both solenoids fail OPEN. 
This provides a bypass round the pilot valve, but flow should be held to a 
safe limit because the saturated flow through two solenoids is small 
compared with the flow through the pilot valve. 
This design overcomes the problem of the stepper motor system and has 
several advantages :-
(i) The solenoids operate at low power, typically 0.25W each which 
would be ideal for a CMOS controller. 
(ii) The solenoids are small, cheap and intrinsically safe.
(iii) Fault-tolerant drive for the solenoids is easier to implement. 
(iv) If it is necessary to protect against solenoid failure, it is 
both cheap and easy to replace each solenoid with four in a 
series/parallel configuration.
The one disadvantage is that the solenoids have a typical life of 
about five million operations and therefore under typical operating 
conditions would need replacing every eighteen months. 
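The replacement interval quoted above implies an average switching rate, which can be checked with a short calculation. The sketch assumes continuous service and an even spread of operations over the eighteen months; neither assumption is stated in the text.

```python
HOURS_PER_YEAR = 8760

def mean_seconds_between_operations(life_operations, life_months):
    """Average time between solenoid operations over the quoted life."""
    life_seconds = life_months / 12 * HOURS_PER_YEAR * 3600
    return life_seconds / life_operations

# Five million operations spread over eighteen months.
interval = mean_seconds_between_operations(5_000_000, 18)
print(f"{interval:.1f} s between operations on average")
```

Under these assumptions each solenoid operates roughly once every nine to ten seconds.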
The experimental rig at Durham consists of the electronic controller 
and the mechanical valve just described. Whilst this is sufficient to
verify techniques of flow/pressure control, the small pilot valve cannot
pass the volume of gas used by several hundred consumers as a typical
governor station is required to do. To pass a greater volume of gas, the
pilot valve could be used to control a large main valve. This technique is
the pneumatic equivalent of a transistor Darlington pair, where a large
current is controlled in the second transistor by a smaller current
injected into the first transistor.
An estimate is needed for the failure rate of the mechanical valve
because of the requirement that the mechanical and electrical parts of the
governor have failure rates which are of the same order of magnitude. Such
data is very difficult to obtain, but British Gas give two failure rates
for small valves. Blake and Drew [26] give a failure rate of 0.185 f/yr,
either OPEN or CLOSED, for an "open at rest" regulator, whilst Drew [25]
gives a failure rate of 0.19 f/yr for an IGA 2000 regulator, although the
report suggests that this figure is a bit pessimistic. So a failure rate
of 0.2 f/yr seems a good estimate for a small regulator valve. This
failure rate excludes the failure rate of the main governor valve, for
which it is impossible to obtain data, but it would be reasonable to assume
an equal failure rate to the pilot valve.
The failure rate of the solenoid valves is neglected since regular
replacement should reduce their failure rate to a negligible value. If
this proves to be a problem "in the field", then a redundant
series/parallel configuration of solenoids should cure the problem.
7.3 NON FAULT-TOLERANT CONTROLLER
The electronic controller specification has been defined in the
previous two sections. It must have a failure rate of approximately 0.4
f/yr, interface to the solenoid controlled valve, and implement the DAG
control algorithm. A first iteration design is now proposed which is then
analysed by an FMEA (Fault Mode Effect Analysis) which highlights the areas
of greatest failure rate.
The block diagram of the proposed controller is given in figure(16).
Using the data of MIL-217D [37] and actual measured device temperatures,
the following failure rates are obtained for the controller :-
Pressure transducer P1 (manufacturer's failure rate)   10.4   f/million hrs.
Pressure transducer P2 (manufacturer's failure rate)   10.4   f/million hrs.
A/D converter                                           1.93  f/million hrs.
Control circuitry                                       2.05  f/million hrs.
Microprocessor + address latch + buffers              460     f/million hrs.
4k static RAM                                          12.3   f/million hrs.
4k EPROM                                                3.0   f/million hrs.
Input/output circuitry                                  0.94  f/million hrs.
Watchdog timer + reset circuitry                        0.63  f/million hrs.
TOTAL                                                 502     f/million hrs.
                                                    =   4     f/yr
This total failure rate is unacceptable and is too high by a factor of
ten. The microprocessor is clearly the most unreliable part of the system,
followed by the pressure transducers and the RAM. It is therefore
necessary to redesign the controller using fault-tolerant techniques to
mask the high failure rate of the components just mentioned.
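The total and its conversion to failures per year can be reproduced from the component figures listed above. The sketch below assumes 8760 operating hours per year and a constant failure rate; the dictionary keys are simply labels.

```python
HOURS_PER_YEAR = 8760

# Failure rates in failures per million hours, as listed above.
failure_rates = {
    "pressure transducer P1": 10.4,
    "pressure transducer P2": 10.4,
    "A/D converter": 1.93,
    "control circuitry": 2.05,
    "microprocessor + address latch + buffers": 460.0,
    "4k static RAM": 12.3,
    "4k EPROM": 3.0,
    "input/output circuitry": 0.94,
    "watchdog timer + reset circuitry": 0.63,
}

total_per_million_hours = sum(failure_rates.values())
total_per_year = total_per_million_hours * 1e-6 * HOURS_PER_YEAR
processor_share = (failure_rates["microprocessor + address latch + buffers"]
                   / total_per_million_hours)

print(f"total = {total_per_million_hours:.1f} f/million hrs")
print(f"      = {total_per_year:.1f} f/yr")
print(f"microprocessor block share = {processor_share:.0%}")
```

The microprocessor block accounts for over nine tenths of the total, which is why it is the first target for redundancy.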
7.4 METHODS OF INTRODUCING FAULT-TOLERANCE
A further design requirement was that the controller could be
programmed by British Gas personnel not having detailed knowledge of
fault-tolerant programming or hardware techniques, and that standard software
should run on the system. For these reasons many of the fault-tolerant
features of the controller are transparent to the user and the specialised
fault recovery software is pre-programmed and called as subroutines.
Alternative fault-tolerant architectures are discussed in chapters
five and six. It was decided to use three microprocessors in a two out of
three majority voting configuration, sharing common memory, to overcome the
high failure rate of a single microprocessor. The voting was to be
performed in hardware which has the advantage that, at the output of the
voters, the TMR configuration just "looks" like a single microprocessor as
long as the three microprocessors run in exact synchronism. This allows
standard software to be run on the system. The voters also act as, and
replace, the buffers between the microprocessor and the system bus. Three
alternative methods were considered for performing the majority vote :-
(i) TTL logic - The components needed for a single channel are shown
in figure(5). To decrease the number of gates required, wire
ORed outputs are used. The disadvantage with wire ORing is that
the rise time of the output is slower than a device with "active"
pull-up. Since it is necessary to vote on thirty channels, this
would require one hundred and fifty gates contained in thirty
eight TTL packages. The large number of integrated circuits
required would be unreliable and bulky.
(ii) EPROM - A four channel voter could be contained in a 4k EPROM.
The disadvantage is that the EPROM could not drive the system bus
without further buffering and that the 450ns access time of the
EPROM would make voting slow and require the microprocessor clock
frequency to be reduced, thus reducing processor throughput.
(iii) FPLA (Field Programmable Logic Array) - The equivalent voting
circuit to figure(5) is programmed into a logic array. A five
channel voter can be contained in an FPLA, offering fast voting
as well as three error flags. The system bus can be driven
directly without further buffering.
The FPLA approach was chosen and is described in more detail later.
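Whichever hardware is used, the function applied to each voted signal is the same two out of three majority. A minimal software model is sketched below; it treats each channel as an integer and votes bit by bit, and it is not a description of the FPLA programming itself.

```python
def majority(a, b, c):
    """Bitwise two out of three vote on three equal-width integers."""
    return (a & b) | (b & c) | (a & c)

# A fault in any single channel is masked at the voter output.
good = 0b1010
print(bin(majority(good, good, 0b0110)))
```

Each output bit takes the value held by at least two of the three inputs, so a wrong bit on one channel never reaches the system bus.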
The three processors share common memory. To improve both the
permanent and transient tolerance of faults, it was decided to protect the
memory using a SEC/DED (Single bit error correction/double bit error
detection) Hamming code, reference [77]. Large scale integrated circuits
are commercially available to implement this type of protection on 16 bit
wide data, references [59,79,80], however two per memory are required. The
devices are expensive, complex, and of unknown though probably high failure
rate. An 8 bit version of these integrated circuits is not available, so
it was decided, in the interests of complexity, reliability, and cost, to
use a simple encoding/decoding PROM to implement SEC/DED protection. This
is described in more detail later.
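To illustrate the principle of SEC/DED protection on 8 bit data, the sketch below uses a textbook extended Hamming code with four check bits and one overall parity bit (13 bits per byte). The bit layout and function names are assumptions chosen for clarity; the PROM-based scheme actually used in the controller may differ.

```python
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)   # positions holding data bits
PARITY_POSITIONS = (1, 2, 4, 8)                 # Hamming check bits

def _check(word, p):
    """Parity (0 or 1) of all positions 1..12 whose index has bit p set."""
    return sum(word[pos] for pos in range(1, 13) if pos & p) % 2

def encode(byte):
    """Encode an 8 bit value as a 13 bit word; index 0 is overall parity."""
    word = [0] * 13
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (byte >> i) & 1
    for p in PARITY_POSITIONS:
        word[p] = _check(word, p)
    word[0] = sum(word) % 2
    return word

def decode(word):
    """Return (byte, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = sum(p for p in PARITY_POSITIONS if _check(word, p))
    overall_odd = sum(word) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd and syndrome <= 12:
        word[syndrome] ^= 1        # syndrome 0 points at the overall parity bit
        status = "corrected"
    else:
        return None, "uncorrectable"   # double (or worse) error detected
    byte = sum(word[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return byte, status
```

A table built by calling encode for all 256 byte values is the kind of content a lookup PROM could hold, which is one reason a PROM is a natural fit for this job.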
In order to make the pressure transducers highly reliable, it was
decided to use three cheap piezo devices in a majority voting
configuration, instead of using one expensive transducer.
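Analogue readings from three transducers will rarely agree exactly, so a bit-for-bit vote is not directly applicable. One common approach, assumed here for illustration and not necessarily the method adopted in the controller, is mid-value selection: the median of the three readings is taken as the voted value.

```python
def mid_value(a, b, c):
    """Return the median of three transducer readings."""
    return sorted((a, b, c))[1]

# One transducer stuck at zero does not disturb the voted reading.
print(mid_value(20.1, 20.3, 0.0))
```

A single reading that fails high or low is always discarded, since it can never be the middle value.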
As a final protection against transient or software errors, it was
decided to incorporate a system "watchdog", which should restore the
controller to correct operation following a system crash.
The control "firmware" will be stored in two separate EPROM sets, each
capable of executing the control algorithm. This provides a hardware spare
as well as allowing a software "spare" to be incorporated at a later date.
If the spare EPROM set contains the same algorithm, but coded differently,
then if a software error is detected, switching over to the alternative
software coding will restore correct operation as long as the same software
error does not exist in both copies of "firmware".
The output drivers to the solenoid valves were designed to tolerate
any single component failure because of the unproven reliability of the
driver integrated circuit and the large currents being repeatedly switched.
CHAPTER 8
DESCRIPTION OF EXPERIMENTAL CONTROLLER
8.1 CONSTRUCTION
The controller consists of three double Eurocard "microbus"
prototyping cards housed in a 6U high 19" Vero rack. The rack also houses
the power supplies, twenty column printer, and pulse generator used for
testing purposes. A photograph of the controller is given in figure(19).
The boards are wire-wrapped as is the system bus to the edge
connectors. Each integrated circuit is decoupled by a 0.1µF capacitor.
The top side of the prototyping boards consists of a ground plane which is
connected to 0V at several points. This reduces noise on the power supply
and gives a low impedance connection between the power supply and the
integrated circuits.
The power supply, whose circuit diagram is given in appendix(1), has
the following outputs :-
+5V @ 5A overvoltage and short circuit protected
+12V @ 1A short circuit protected
-12V @ 1A short circuit protected
+28V @ 1A short circuit protected
The mains supply to the power supply is filtered to reduce the effect of
mains borne transients.
The signals carried by the backplane bus are defined in table(19).
The layout of the circuit boards is given in appendix(2).
8.2 TMR MICROPROCESSOR BOARD
The block diagram of the microprocessor board is given in figure(21).
Each block will be described in detail with reference to its circuit
diagram. A two out of three majority vote is performed on three
microprocessors running in synchronism, the voting being at bus level. The
voters drive the system bus directly. A common self-synchronising clock is
connected to the processors as well as synchronisation hardware and reset
circuitry/watchdog timer. An RS232 interface is provided to communicate
with a VDU or printer.
8.2.1 Microprocessor block
Three identical channels consisting of a microprocessor and associated
buffers are connected as shown in figures(22,23,24). Considering channel
1, as shown in figure(22), an 8085 microprocessor, U9, has its eight least
significant address lines A0-A7 demultiplexed by U8 which are then
connected to the voters. Address lines A8-A15 are connected directly to
the voters, as are several control lines and the SOD line.
Bidirectional data buffers U12, U13, separate data into DATA OUT and DATA
IN. The DATA IN connections to the three channels are connected in
parallel and to the system bus. DATA OUT is connected to the voters.
Pull-ups are connected to AD0-AD7 and the RD and WR lines. This is because
these lines are tristated during certain instructions and the pull-ups will
ensure that all three channels are pulled up to +5V to prevent voting
errors due to the indeterminate nature of a tristated line. All inputs to
the microprocessors with the exception of CLK, TEST, and HOLD are connected
in parallel, whilst most outputs, as described, are voted upon.
On the circuit diagrams a number in brackets is given after the signal
name. This identifies the channel from one to three and is used on both
inputs and outputs. Where no bracketed number is given, the signal
referred to is an output from the system voters or is a signal on the
system bus.
8.2.2 Voters
The voting on ninety signals (thirty from each microprocessor) is
performed in six identical FPLAs as shown in figure(25). Each FPLA votes
upon fifteen signals and reduces them to five signals. The remaining three
outputs from the FPLAs are used as error flags, all of which are active low
so that error flags such as 1+2 and 2+3 can be wire ORed. The error flag
1+2 indicates an error is present on channel 1 or 2, whilst 2+3 indicates
an error on channel 2 or 3. Thus by examining both error flags it is
possible to determine which channel does not agree with the other two. The
error flags V1-V6 indicate in which voter there is a voting error. Thus it
is possible to determine which channel is in error as well as the nature of
the error: address bus, data bus, or control bus.
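The decoding of the two channel error flags can be written out explicitly. In the sketch below the flags are passed as booleans that are True when an error is signalled, whereas the hardware flags are active low; the function name is illustrative only.

```python
def faulty_channel(flag_12, flag_23):
    """Identify the disagreeing channel from the 1+2 and 2+3 error flags.

    Each argument is True when the corresponding flag signals an error
    (the hardware flags are active low, so True means the line is low).
    Returns 1, 2 or 3, or None when neither flag is raised.
    """
    if flag_12 and flag_23:
        return 2        # only channel 2 is common to both flags
    if flag_12:
        return 1
    if flag_23:
        return 3
    return None
```

Channel 2 appears in both flags, so both being raised together points to it, while a single raised flag points to the channel that is not shared.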
Due to slight timing differences between the three processors as well
as access time delay in the voters, the error flags do not give a constant
logic '1' for no error and logic '0' for an error. Instead a ragged signal
is produced which is either predominantly logic '1' or '0'. For this
reason the error flags are filtered by a low pass filter consisting of a 1k
resistor and a 330pF capacitor. The oscilloscope trace of figure(26) shows
the unfiltered error flag at the top and the filtered flag below. It can
be seen from the unfiltered signal that the rise time of the voters is
considerably longer than the fall time (approx. 50ns). This is because
open collector outputs are used which have active pull down, but passive
pull-up via the 1k resistors connected to all of the outputs.
The calculation of the low pass filter values and the programming
information for the FPLA is given in appendix(3).
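For orientation, the time constant and cut-off frequency of the 1k/330pF filter follow from the standard first-order RC formulas. This is only a quick check using those textbook relations and does not reproduce the appendix calculation.

```python
import math

def rc_time_constant(r_ohms, c_farads):
    """Time constant of a first-order RC low pass filter, in seconds."""
    return r_ohms * c_farads

def rc_cutoff_hz(r_ohms, c_farads):
    """-3dB cut-off frequency of a first-order RC low pass filter, in Hz."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

tau = rc_time_constant(1e3, 330e-12)
fc = rc_cutoff_hz(1e3, 330e-12)
print(f"time constant = {tau * 1e9:.0f} ns, cut-off = {fc / 1e3:.0f} kHz")
```

The time constant of about 330ns is several times the 50ns fall time quoted above, so brief glitches on the flags are smoothed out.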
8.2.3 Self synchronising clock
The clock input to the 8085 is divided by two internally before being
used as the internal microprocessor clock. This internal clock signal is
available at the CLK OUT pin. It was found that on power-up, the divide by
two stage could power up in either state. Normally this would not matter,
but for three processors which are desired to run in synchronism, this is
serious. It was found that one processor could power-up half an internal
clock cycle out of phase with the other two. This situation could only be
remedied by switching off the power and trying again. The self
synchronising clock circuit of figure(28) was developed to overcome this
problem. The three CLK OUT signals are fed to a three to eight line
decoder, U22, which controls the three AND gates, U23. The AND gates
remove the clock signal to the errant processor, should it be out of
synchronism with the other two. A conventional 4MHz quartz controlled
clock is generated by U20 and associated components. The three AND gates,
U23, provide some degree of isolation between the clock inputs, should any
of them stick at logic '1' or '0'.
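The effect of withholding the clock from an out-of-phase processor can be seen in a simple behavioural model. The sketch below represents each internal divide by two stage by its CLK OUT level and is an idealised assumption; it is not a gate-level model of U22 and U23.

```python
def step(states):
    """Apply one input clock edge to three divide by two stages.

    states holds the three CLK OUT levels (0 or 1). A stage that
    disagrees with the other two has its clock edge withheld, so it
    keeps its level while the others toggle.
    """
    majority = 1 if sum(states) >= 2 else 0
    return [s ^ 1 if s == majority else s for s in states]

# Processor 3 powers up half an internal clock cycle out of phase.
states = [0, 0, 1]
states = step(states)
print(states)
```

In this model a single withheld edge is enough to bring the errant stage into phase, after which all three toggle together.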
8.2.4 Synchronisation hardware
Additional circuitry is required to synchronise the three processors
and latch the voter error flags in the event of a voting error. This
circuitry is shown in figure(29). The signals marked PORTxx(RD) or
PORTxx(WR) refer to outputs from the port address decoders, where
PORTxx(RD) is generated by an INxx instruction and PORTxx(WR) by an OUTxx
instruction. The top half of the circuit latches the error flags and the
bottom half is used to synchronise the three processors.
When a voting error occurs either or both error flags 1+2 and 2+3 go
low, causing the output of U23d pin 11 to go low. Assuming that a
PORT01(WR) has taken place prior to a voting error, the flip/flop U28a,
U28b will have been reset. The flip/flop is set by a low on pin 1 of U28a and
the falling edge on pin 6 of the flip/flop triggers the monostable U18b.
After a short delay, governed by the 33pF/22k RC network (440ns), the
monostable generates a rising edge which latches the error flags in U31, an
octal latch. The delay is necessary so that the filtered error flags have
time to settle before being latched. The calculation of the necessary
delay and hence RC network is given in appendix(4). The latched error
flags are read by a read from PORT01. The flip/flop U28 prevents further
transitions on U23d pin 11 (further voting errors) from triggering U18b and
latching new error flags before the old ones have been read. A write to
PORT01 must be executed to reset the flip/flop which re-arms the error flag
latching mechanism. A two bit tristate buffer, U30, allows the
instantaneous error flags 1+2 and 2+3 to be read as PORT02.
Synchronisation of the processors is achieved by a combination of
hardware and software. It is essential to synchronise the three program
counters. This is hardware controlled and can only be done by executing a
simultaneous RESET or interrupt. A RESET performs a cold start and is
therefore undesirable except for initial power-up of the system, when a
RESET causes all three processors to be synchronised, so interrupt driven
resynchronisation was chosen. The hardware interrupt process synchronises
the program counter and the interrupt software completes the rest of the
synchronisation. It is not sufficient to simply supply a common interrupt
to all three processors because an interrupt is only recognised at the end
of the current instruction. Two processors will be executing one
instruction, whilst the third, errant processor, may be executing an
instruction of different length. The processors will therefore recognise
an interrupt at different times and synchronisation cannot be achieved. If
several retries are made, then it is possible to execute a simultaneous
interrupt, but one further problem exists. It was found that one processor
could lock one clock cycle behind the other two. Considering the op-code
fetch timing of figure(17), it is seen that data is valid and read at T3,
but continues to be valid at T4. Hence one processor can lock one clock
period behind and read an instruction at T4 which then becomes T3 for that
processor. Hence the errant processor will execute the same op-codes as
the other two, but one cycle behind and will read/write incorrect data in
three clock cycle read/write operations. It is impossible to cure this
situation by executing further interrupts and the errant processor will
remain locked one clock cycle behind.
In order to overcome these synchronisation problems the RST/HOLD
circuitry was developed. The RST6.5 interrupt is used to initiate
processor resynchronisation, executing in twelve clock cycles. The RST6.5
interrupt is generated and held for ten clock cycles, just less time than
the interrupt "instruction" takes to execute. This gives a ten clock cycle
"window" during which time any of the three processors may recognise the
interrupt. Then a HOLD pulse is generated for a further ten clock cycles.
Activation of the HOLD pin on the microprocessor suspends all external
activity, but allows internal processing to continue. In this way the
three processors are given time to catch up with each other and removal of
the HOLD pulse should cause them all to set off in synchronism. The
timing of the RST/HOLD pulses is shown in figure(18). One processor may
just be finishing an instruction and another may just be starting an
eighteen clock cycle instruction, which leads to a worst case seventeen
clock cycle difference between the two processors which is too wide for the
ten clock cycle "window". In these circumstances it will be necessary to
perform a retry. In practice it has been found that resynchronisation is
achieved after a few retries at worst.
The RST/HOLD circuitry is shown at the bottom of figure(29). The clock
is divided by ten in U27a to give a frequency of 200kHz which is also used
by the A/D converter. The divided clock is fed to the two bit counter U27b
which cycles through the following states, changing every ten clock cycles.
00    No action
01    RST6.5 interrupt
10    HOLD operation
11    Reset flip/flop U21
The two bit counter U27b is normally held cleared (output=00) by a
high on pin 14. When a voting error causes U23d pin 11 to go low, this
sets flip/flop U21 which is connected to the clear input of the two bit
counter and the counter advances through the states given above. When the
count reaches 11 this is detected by U21c which resets the flip/flop U21.
The flip/flop formed by U28c, U28d is used to enable/disable the RST/HOLD
circuitry. A low on U28c pin 11 holds the output of flip/flop U21
permanently high, disabling the RST/HOLD circuitry. The flip/flop is
disabled by a write to PORT00 and is enabled by a write to PORT02.
The PORT07(WR) select is connected to flip/flop U21 so that it is set
by a write to PORT07. Thus it is possible to trigger the RST/HOLD
circuitry by a port write instruction as well as by detection of a voting
error. This is useful for testing purposes and software initiated
resynchronisation.
The port addresses for the resynchronisation hardware are summarised
below :-
PORT   READ                             WRITE
0      -                                Disable RST/HOLD circuitry
1      Read latched error flags         Reset flags latch flip/flop
2      Read instantaneous error flags   Enable RST/HOLD circuitry
7      -                                Trigger RST/HOLD circuitry
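The sequencing described above can be illustrated with a short behavioural sketch in Python. It is only a model under stated assumptions, not the thesis hardware: the class and method names are hypothetical, the trigger is assumed to coincide with an edge of the divided clock, and a write to PORT00 is assumed to abandon any sequence in progress.

```python
CYCLES_PER_STATE = 10   # counter U27b changes state every ten clock cycles


class RstHoldSequencer:
    """Behavioural model of the RST/HOLD counter (hypothetical names)."""

    def __init__(self):
        self.enabled = False   # enable flip/flop (U28c, U28d)
        self.running = False   # flip/flop U21 set, counter released
        self.cycles = 0        # processor clocks since the trigger

    def trigger(self):
        """Voting error, or a write to PORT07."""
        if self.enabled and not self.running:
            self.running = True
            self.cycles = 0

    def write_port(self, port):
        if port == 0x00:       # disable RST/HOLD circuitry
            self.enabled = False
            self.running = False   # assumption: sequence is abandoned
        elif port == 0x02:     # enable RST/HOLD circuitry
            self.enabled = True
        elif port == 0x07:     # software initiated resynchronisation
            self.trigger()

    def clock(self):
        """Advance one processor clock and return (rst65, hold)."""
        if not self.running:
            return (False, False)
        state = self.cycles // CYCLES_PER_STATE   # 0, 1, 2 or 3
        self.cycles += 1
        if state == 0b11:      # count 11 resets U21 and clears the counter
            self.running = False
            return (False, False)
        return (state == 0b01, state == 0b10)


seq = RstHoldSequencer()
seq.write_port(0x02)           # enable
seq.write_port(0x07)           # trigger
trace = [seq.clock() for _ in range(40)]
```

In this model the trace holds ten cycles of no action, ten cycles of RST6.5 and ten cycles of HOLD, after which the sequencer returns to rest until the next trigger.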
8.2.5 Reset circuitry and watchdog timers
As well as power-on reset (POR) it is necessary to reset the
controller if the +5V logic supply dips below acceptable limits caused by a
short interruption to the mains supply. It was found that most RC
controlled POR networks will not detect a short interruption in the +5V
supply, but this interruption was sufficient to cause the microprocessor to
cease reliable operation and crash. Such a short interruption to the mains
supply has been experienced during electrical storms when there is a long
delay before auto re-closure of circuit breakers following a lightning
strike to the supply cables.
Two watchdog timers are included to reset the system in the event of
it crashing. Both watchdogs are monostables which are triggered by a write
to PORT06. The instruction to retrigger the watchdogs (OUT 06) occurs once
only in the control software and is located in the main program loop. The
monostables are retriggered at intervals shorter than their pulse width so
that a "constant" pulse is generated. However if the program crashes this
will cause the retriggering operation to cease and the monostable will time
out after a delay determined by its pulse width. A TRAP (non maskable)
interrupt is generated when the first watchdog times out, which generates a
warm start. If the warm start fails, then the second watchdog will
"time-out" after a further delay and a cold system RESET will result.
The reset circuitry and watchdogs are shown in figure(30). The level
triggered reset circuitry uses a "long tailed pair" contained in U29 as a
comparator which compares the reference voltage across two silicon diodes
to the +5V logic supply. The +5V supply is divided by a potential divider
before comparison. By suitable choice of the potential divider, the
circuit is arranged to give a logic '0' if the power supply falls below
4.5V and will give a logic '1' otherwise. Power-on-reset is detected and
manual reset is provided by a push button switch.
Monostable U18a forms the TRAP watchdog which is triggered by a write
to PORT06. The pulse width is set by the RC network and in this case is
set to approximately two seconds. When the watchdog times-out, the Q
output goes from a logic '0' to '1', generating a TRAP interrupt.
Monostable U19a, which has a pulse width of four seconds, forms the
RESET watchdog and is retriggered by a write to PORT06. When the watchdog
times-out it triggers the monostable U19b, generating a RESET pulse of
approximately one millisecond duration. The rising edge of the RESET pulse
is used to retrigger monostable U19a in case the reset operation has not
been successful. In this way the watchdog is retriggered by the RESET
pulse as well as a write to PORT06, so that if the reset is not successful,
then the watchdog will continue to generate RESET pulses at an interval
governed by U19a. A divide by two flip/flop is connected to the output of
the RESET watchdog which will "toggle" every time the watchdog times-out.
The output from this flip/flop is used to select one of two EPROM sets,
both of which are capable of executing the control program. Thus in the
event of a system crash a warm start is tried first, then the spare EPROM
set is switched in as well as performing a cold reset.
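The two-stage recovery sequence can be sketched as a simple timeline in Python. This is an illustrative model only: the function name, the event strings and the numbering of the EPROM sets as 0 and 1 are assumptions, and the pulse widths are the approximate two and four second values quoted above.

```python
TRAP_WIDTH = 2.0    # seconds, TRAP watchdog U18a (approximate)
RESET_WIDTH = 4.0   # seconds, RESET watchdog U19a


def watchdog_events(last_kick, until, warm_start_ok=False, reset_ok=False):
    """List (time, event) pairs following the final write to PORT06.

    last_kick     -- time of the last OUT 06 before the program crashed
    until         -- end of the period to model
    warm_start_ok -- True if the TRAP warm start restores the main loop
    reset_ok      -- True if the first cold RESET restores the main loop
    """
    events = []
    eprom_set = 0                       # divide by two flip/flop output
    t_trap = last_kick + TRAP_WIDTH
    if t_trap > until:
        return events
    events.append((t_trap, "TRAP warm start"))
    if warm_start_ok:
        return events                   # software resumes writing PORT06
    t = last_kick + RESET_WIDTH
    while t <= until:
        eprom_set ^= 1                  # flip/flop toggles on each time-out
        events.append((t, "RESET, EPROM set %d" % eprom_set))
        if reset_ok:
            break
        t += RESET_WIDTH                # RESET pulse retriggers U19a
    return events


timeline = watchdog_events(last_kick=0.0, until=13.0)
```

With neither recovery succeeding, the model produces a warm start at two seconds followed by cold resets every four seconds, alternating between the two EPROM sets.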
8.2.6 RS232 interface
The serial input and output lines are connected to RS232 drivers as
shown in figure(28) so that an RS232 device can communicate with the
controller for debugging, error logging, or console input/output purposes.
8.3 MEMORY BOARD
The memory board block diagram is given in figure(31) and contains 4k
of static RAM which is Hamming code protected, two sets of 4k EPROM, 4k
"instant ROM" and address decoders for memory and ports. At bus level the
board "looks" like a conventional memory board and many of the
fault-tolerant features are transparent to the user, although error flags
and additional self-test circuitry are included to make full use of the
fault-tolerant features of the board.
The RAM is protected by a SEC/DED Hamming code. The necessary check
bits are generated by a look-up table in the encoding ROM and stored
alongside data in the RAM. When data is read from the RAM, the data and
check bits are read out and decoded by a ROM, which corrects the data if
necessary as well as flagging errors.
Eight-bit wide data requires five check bits. This would require an
encoding ROM having eight address lines and thirteen data outputs. The
decoding ROM would need thirteen address lines and thirteen data outputs if
error flags are included. Both these ROM sizes are very large and not
commercially available. Two 2k EPROMs could be used for encoding and two
8k EPROMs for decoding, which are commercially available. This approach
would be no more reliable than the design adopted, but would reduce the
storage required from sixteen bits wide to thirteen bits wide. The
disadvantage of using EPROMs instead of ROMs is their slow access time
(450ns) as opposed to 50ns for bipolar ROMs. This would add to the memory
read/write access time and would require the microprocessor to be slowed
down, thus reducing processor throughput. The only advantages of EPROMs
are their lower power consumption than bipolar ROMs, which may be important
in a CMOS design, and their slightly higher reliability.
In order to use small commercially available bipolar ROMs, it was
decided to split the eight bit wide data into two identical four bit wide
streams which has the added advantage that each stream will correct a
single bit error and detect a double bit error. Thus if simultaneous
errors occur in both streams, the protection circuitry will correct double
bit errors and detect four bits in error. Each stream has its own Hamming
code protection in the form of four check bits. Data is stored as four
data bits and four check bits and is read out as eight bits, which are
decoded to restore the four data bits as well as generating four error
flags. The encoding ROM can be contained in a 32x8 device and the decoding
ROM in a 256x8 device, both of which are small and easily available.
The ROM look-up tables are calculated as follows. The SEC/DED code
consists of four data bits and four check bits organised as :-
D3 D2 D1 D0 C3 C2 C1 C0    where Dx = data bit
                                 Cx = check bit
Data is stored and read out in this form. On read out the check bits are
re-calculated from the data bits and these check bits are exclusive ORed
with the check bits read out from the RAM. The result is four correction
bits, C3' C2' C1' C0'.
$C_3' = C_3\,(\text{recalculated}) \oplus C_3\,(\text{stored})$ etc.
These correction bits identify the position of an error if it is a single
bit error or indicate a double bit error according to table(20).
Looking vertically up the columns and writing down a data bit where a '1'
occurs in table(20), it can be seen that the check bits are calculated by :-
$C_0 = D_0 \oplus D_1 \oplus D_2$
$C_1 = D_0 \oplus D_1 \oplus D_3$
$C_2 = D_0 \oplus D_2 \oplus D_3$
$C_3 = D_1 \oplus D_2 \oplus D_3$
Appendix(5) shows the look-up table contained in the encoding ROM which is
calculated according to the equations above.
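As an illustration of how such a table can be generated from the check bit equations, the following Python sketch builds a sixteen entry encoding table. The function names are hypothetical, and the sketch fills only the sixteen locations addressed by the four data bits; how the remaining locations of the 32x8 device are used is not assumed here.

```python
def check_bits(d):
    """Return C3 C2 C1 C0 (as a 4-bit value) for data bits D3 D2 D1 D0."""
    d0 = (d >> 0) & 1
    d1 = (d >> 1) & 1
    d2 = (d >> 2) & 1
    d3 = (d >> 3) & 1
    c0 = d0 ^ d1 ^ d2
    c1 = d0 ^ d1 ^ d3
    c2 = d0 ^ d2 ^ d3
    c3 = d1 ^ d2 ^ d3
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0


def encode(d):
    """Pack one stored word in the order D3 D2 D1 D0 C3 C2 C1 C0."""
    return ((d & 0xF) << 4) | check_bits(d & 0xF)


# One table entry per four bit data value.
ENCODING_ROM = [encode(d) for d in range(16)]

# Smallest number of differing bits between any two stored words.
MIN_DISTANCE = min(
    bin(a ^ b).count("1")
    for a in ENCODING_ROM
    for b in ENCODING_ROM
    if a != b
)
```

A minimum distance of four between stored words is what allows single bit errors to be corrected while double bit errors are still detected.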
8.3.1 Decoding
The decoding ROM is also a look-up table which corrects the data if
necessary and generates four error flags S2 S1 S0 E, where S2 S1 S0
identify which RAM (out of eight) is in error and E signals the occurrence
of a single bit error. If E is not set, but S0 is set, then this indicates
a double bit error. This is summarised in table(21). The error flags S2
S1 S0 E are referred to as the four bit error syndrome.
A program was written in assembler to calculate the information to be
programmed into the decoding ROM. The decoding ROM look-up table is given
in appendix(6).
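The decoding step can be sketched in Python as follows. This is not the assembler program referred to above. It assumes the stored word order D3 D2 D1 D0 C3 C2 C1 C0, numbers the bit positions 7 down to 0 in that order (an assumption), and returns a status string rather than the S2 S1 S0 E flags, whose exact assignment is defined by table(21).

```python
def check_bits(d):
    """Return C3 C2 C1 C0 (as a 4-bit value) for data bits D3 D2 D1 D0."""
    d0, d1, d2, d3 = d & 1, (d >> 1) & 1, (d >> 2) & 1, (d >> 3) & 1
    return (((d1 ^ d2 ^ d3) << 3) | ((d0 ^ d2 ^ d3) << 2)
            | ((d0 ^ d1 ^ d3) << 1) | (d0 ^ d1 ^ d2))


# Correction bits produced by a single error in each stored bit position.
SINGLE = {check_bits(1 << i): 4 + i for i in range(4)}   # data bit errors
SINGLE.update({1 << i: i for i in range(4)})             # check bit errors


def decode(word):
    """Return (data, status, position) for an 8-bit stored word."""
    data = (word >> 4) & 0xF
    stored = word & 0xF
    correction = check_bits(data) ^ stored    # C3' C2' C1' C0'
    if correction == 0:
        return data, "ok", None
    if correction in SINGLE:                  # odd number of bits set
        position = SINGLE[correction]
        if position >= 4:                     # error lies in a data bit
            data ^= 1 << (position - 4)
        return data, "single", position
    return data, "double", None               # even, non-zero pattern


# Build a 256 entry table, one entry per possible stored word.
DECODING_ROM = [decode(w) for w in range(256)]
```

Indexing `DECODING_ROM` with the eight bits read from one stream gives the corrected data together with the error classification, which mirrors the role of the 256x8 decoding ROM.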
8.3.2 Circuit description
The circuit diagram of the board is given in figure(32). The encoding
ROMs U33, U34 convert the eight bit data to eight data bits and eight check
bits which are then stored in the 4k x 16bit memory matrix which is shown
in more detail in figure(33). It is important to use one bit wide RAMs so
that the failure of a single device causes a single bit failure which can
be corrected. The decoding ROMs U35, U36 regenerate the eight bit data,
corrected if necessary, flagging any errors that occur. It is possible to
connect a much larger memory matrix between the encoders/decoders which may
be dynamic RAM if required. In fact commercially available sixteen bit
memory boards would be ideal. The data from the decoding ROMs is connected
to the system bus via the tristate buffer U38 which is enabled by a read
request to the RAM. The enable is generated by the OR gate U39 which is
connected to the RAM CS and the RD line, thus producing a RAMRD signal.
The error flags are latched into U37 in the event of an error and can be
read by a read to PORT00. Three OR gates, U39, are connected to form a four
input OR gate. The four inputs are connected to the decoding ROM error
flags E and S0. Any one of these flags going high indicates an error, hence
the output of U39 pin 6 is '0' for no error and '1' for an error. This
single error flag is sampled and latched in U40a on the rising edge of
every RAMRD. If a '0' is latched in, it will have no effect, but a '1'
will cause a '1' to be latched into U40b, causing the Q output to go high.
This generates a RST5.5 interrupt as well as latching the error flags. It
is left up to the software interrupt routine to interrogate the error flags
and determine whether the error is a single or double bit error and which
RAM device is in error. A read to PORT00 reads the error flags, as well as
resetting the flip/flop U40b. Until the flip/flop is reset, no further
error flags can be latched into U37.
8.3.3 Address decoding and "spare" EPROMs
Address decoder U41 decodes eight 8k pages which are then further
divided into 2k pages by U42. The "instant ROM" and RAM are decoded by
U42a, whilst U42b decodes the EPROM memory. The most significant address
line connected to U42b, A12, is connected via the exclusive OR gate U43.
This is so that when the ROM SELECT line is at '0' the A12 line is not
inverted and when at '1' the address line is inverted. With A12 not
inverted the address decoding is as shown with the first EPROM set located
from 0000-0FFF and the second from 1000-1FFF. If the A12 line is inverted,
then EPROM set two is selected instead of set one and vice-versa. The
"spare" EPROM set located from 1000-1FFF can be read as normal memory,
however it is identical to set one and programmed to run from 0000-0FFF,
so will not run in the range 1000-1FFF. If the two sets are swapped by the
ROM SELECT line, then the "spare" will appear to occupy memory in the range
0000-0FFF and will run and the other set will appear in the range
1000-1FFF. Thus it is possible for self-testing purposes to read the
"spare" EPROM set, and in the event of a failure, a separate set can be
switched in whilst allowing interrogation of the old set to determine the
cause of failure.
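The effect of the exclusive OR gate on A12 can be sketched in a few lines of Python. The function name and the numbering of the EPROM sets as 1 and 2 are assumptions made for the illustration only.

```python
def eprom_physical(address, rom_select):
    """Map a bus address in 0000-1FFF to (EPROM set, offset within set).

    rom_select is the state of the ROM SELECT line (0 or 1). The gate
    U43 inverts A12 when ROM SELECT is '1', which swaps the two sets.
    """
    if not 0x0000 <= address <= 0x1FFF:
        raise ValueError("address outside the EPROM range")
    a12 = ((address >> 12) & 1) ^ (rom_select & 1)
    eprom_set = 2 if a12 else 1
    return eprom_set, address & 0x0FFF


# With ROM SELECT at '1' the reset address 0000 is served by set two,
# while set one remains readable from 1000-1FFF for interrogation.
after_swap = (eprom_physical(0x0000, 1), eprom_physical(0x1000, 1))
```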
The "instant ROM" and EPROM circuitry is shown in figure(34). 
"Instant ROM" is essentially battery-backed RAM which can be written to and 
retains data when power is removed. Its pin connections make it pin
compatible with EPROM. Two "instant ROMs" are installed, U45, U46, located
from 2000-2FFF. They are write-protected by the switch shown and are
buffered by bidirectional buffers U51, U52. Four EPROMs are installed, U47,
U48, U49, U50, located from 0000-1FFF and are buffered by U53 which is
enabled by the BUFFEN signal from the EPROM decoder U42b.
The input port address decoder is shown at the bottom of figure(34). 
8.3.4 Memory matrix 
The 4k X 16bit memory matrix is shown in more detail in figure(33). 
Memories U55-U70 are 4k x 1bit static RAMs and the CS line to each device
is connected via the OR gates U71-U74. This is for testing purposes only
and allows RAM devices to be switched out by preventing their CS from going
low. Dual-in-line switches S1, S2 are used to switch out RAMs and pulses
injected into the control inputs of the OR gates enable the RAMs to be 
transiently switched out. 
8.4 INPUT/OUTPUT BOARD 
The block diagram of the input/output board is given in figure(35). 
The board contains a battery-backed real time clock (RTC), eight bit input
port, eight bit output port, eight channel A/D converter, eight bit
solenoid driver, as well as the necessary power electronics to drive a
stepper motor and solenoid valves. A separate printed circuit board (PCB) 
holds the six pressure transducers as well as their signal conditioning 
circuitry. 
8.4.1 Real time clock 
The real time clock U77 is shown in figure(36). This is buffered by
the bidirectional buffer U78. On failure of the +5V supply, the clock
switches over to standby operation and continues to run. A ni-cad battery,
which is trickle charged under normal operation, provides the necessary
back-up supply. Open collector AND gates U75 are used to buffer the RD and
WR signals so that the RTC read and write inputs are isolated from the
power supply when it is switched off and are pulled up to Vdd. This is
necessary for correct standby operation. The write strobe to the RTC is
connected via LK1 which write-protects the RTC when it is removed. This is
so that a system crash will not corrupt the RTC. The RTC is decoded as
sixteen ports from PORT10 to PORT1F and is programmed and read as described
in appendix(7) which is an extract from the Radiospares data sheet. 
8.4.2 Multiplexer and A/D converter 
An eight bit analogue multiplexer, U79, is controlled by three address
lines MUX0, MUX1, MUX2 and is connected to the A/D converter U80. Inputs
to the multiplexer are low pass filtered with a cut-off frequency of about
170Hz. Inputs P0 to P6 are from the pressure transducers and S1, S2
monitor the voltage applied to the solenoid valves. The A/D converter is
clocked by a 200kHz signal already available which is obtained by dividing
the processor clock by ten. The two flip/flops, U81, synchronise the start
conversion signal and the clock. A write to PORT03 starts the conversion
and a read to PORT03 enables the tristate output of the converter, allowing
the digital output to be read onto the system bus. The end of conversion
flag is connected to the most significant bit of the eight bit input port
U82. The other seven inputs to the port are spare and are connected to a 
dil switch. The input port is read as PORT04. 
The bottom of figure(36) shows the output port address decoder U83. 
8.4.3 Stepper motor driver 
The eight bit output port, U84 which is written to as PORT05 performs 
two functions. The three least significant bits are used to control the 
analogue multiplexer via the address lines MUX0 to MUX2, whilst the four
most significant bits drive four Darlington power transistors. The
transistors are sufficient to drive the four phases of a stepper motor with
the common stepper motor connection in this case connected to +5V.
Protection diodes are incorporated as shown. The output port is cleared
when a R E S E T is executed. 
8.4.4 Solenoid drive circuitry 
The solenoid valve drive circuitry is shown at the bottom of 
figure(37). An octal latch and driver, U85, is connected via two resistor
networks to the solenoid valves. The circuit is designed so that solenoids
can still be switched on and off in the event of any single component
failing. The aim of the design is to protect against failure either short
circuit or open circuit of the solenoid driver transistors. 
The solenoids, although rated at 28V, were found to pull-in at greater
than 8.0V and to drop-out at less than 2.6V. These voltages define the ON
and OFF limits in the event of component failure in the driver circuitry.
The drive circuit is basically a potential divider with switched resistors
forming the lower leg of the divider. By switching in different resistors
R1 to R4 the voltage across R5 is varied. This voltage swing is connected
across the solenoid and used to switch it ON and OFF. If all transistors
are off, then the voltage across R5 is zero, however if one transistor
fails short circuit, the voltage across R5 is equal to the minimum value.
If all four transistors are now switched on, the voltage across R5 is
increased. If one transistor fails open circuit, the voltage developed
across R5 is reduced and equal to the maximum value. The aim of the
potential divider is to ensure that with one transistor failed open or
short circuit, the swing from maximum to minimum is sufficient to turn the
solenoid ON and OFF. The zener diode is necessary to subtract a bias
voltage from the voltage across R5 so that the swing across the solenoid is
within its ON and OFF limits. Two zener diodes are used in parallel to
protect against the zener failing open circuit. Any of the resistors, 
transistors, or zener diodes can fail open or short circuit and the driver 
will continue to switch the solenoid ON and OFF as long as it is a single 
component failure. The voltage applied to the bottom end of the solenoid 
is monitored by the A/D converter after being attenuated by the potential 
divider made up of the 22k/1.2k resistors, which attenuate by a factor of 
approximately twenty. By switching the divider transistors and monitoring 
the solenoid voltage, it is possible to diagnose component failure in the 
driver stage. The detailed calculation of values for the resistors, zener 
diodes and supply voltage is given in appendix(8).
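The attenuation quoted for the monitoring divider can be checked with a short Python calculation. The sketch assumes that the 1.2k resistor forms the lower leg of the divider, across the A/D input; the function names are hypothetical.

```python
def divider_attenuation(r_upper, r_lower):
    """Factor by which a two-resistor divider reduces its input voltage."""
    return (r_upper + r_lower) / r_lower


# Assumption: 22k upper leg, 1.2k lower leg across the A/D input.
ATTENUATION = divider_attenuation(22e3, 1.2e3)   # about 19.3


def solenoid_volts(adc_input_volts):
    """Estimate the solenoid-side voltage from the attenuated A/D input."""
    return adc_input_volts * ATTENUATION
```

Under this assumption the factor is about 19.3, consistent with the "approximately twenty" stated in the text.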
8.4.5 Pressure transducers
Six "Foxboro" piezo pressure transducers type 1,800 01 C I OOC 0 are
mounted on a separate printed circuit board. Each transducer uses a quad
operational amplifier connected as an instrumentation amplifier as well as
providing the constant current excitation to the transducers. The six
transducers and their amplifiers are identical except that there is only
one band-gap reference for all six transducer amplifiers.
The circuit diagram for a typical channel is given in figure(38). A 
band-gap reference is connected to U86a which controls the constant current 
of 1.5mA through the transducer. The voltage produced across the 
transducer is amplified by the instrumentation amplifier U86c, U86d which
has an adjustable gain of up to 300 set by the "span" preset
potentiometer. A non-inverting amplifier, U86b, having a gain of ten,
produces a voltage of 0 to 7.5V at its output. This voltage is derived
from the band-gap reference and is therefore stable and can be varied by
the "zero" preset potentiometer. This variable voltage is fed via a large
resistance into the inverting input of the final operational amplifier,
U86d and is used to zero the output. Resistor R6 is individually chosen to
have a voltage drop of about 1.5V across it. This is so that slight
variations in the 0 to 7.5V supply do not affect the zeroing very much.
Resistors R1 to R5 are supplied with the transducer and are for temperature
compensation. Resistors R1 to R6 are connected as shown to a header plug.
Each transducer and amplifier is adjusted to give a swing of 1.5V for
a 0 to 15"w.g. pressure range, and a "zero" of 0.1V.
8.5 PNEUMATIC TEST RIG
The test governor was built to demonstrate the reliable control of gas
flow and pressure, and to show that the governor will continue to function
in the presence of errors. Test circuitry allows transient and permanent 
errors to be injected into the controller. A schematic diagram of the 
pneumatic governor is given in figure(39) as well as a photograph in 
figure(20). 
Considering the schematic diagram of figure(39), dry compressed air at
approximately 10psi enters the "J" governor which reduces the pressure to
20"w.g. which is approximately held constant at a safe inlet pressure for
the main control valve. Gas now passes through the modified "K" pilot, as
described in chapter(7), which is controlled by the two solenoid valves S1,
S2 and is stabilised by the needle valve. It is appreciated that it is
undesirable to place bends in the pipework close to regulators or pressure
tappings, but in order to achieve a compact design, it was necessary to
bend the rig round on itself. This is unimportant for demonstration
purposes. A needle valve is used as an adjustable "orifice plate" which
has pressure tappings either side of it. Transducers P1, P2, P3 measure
the outlet pressure of the "K" pilot and P4, P5, P6 measure the outlet
pressure of the governor. The pressure differential across the "orifice 
plate" is a measure of the flow through the system according to the 
equation :-
$\Delta P = kQ^2$    where k = orifice plate constant
                           Q = flow
                           $\Delta P$ = pressure differential
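As a minimal sketch of how this relation might be used, the following Python function inverts it to recover the flow from a measured pressure differential. The function name is hypothetical and the units of the result follow from whatever units are used for k.

```python
import math


def flow_from_pressure_drop(delta_p, k):
    """Invert delta_p = k * Q**2 to give the flow Q.

    delta_p -- pressure differential across the "orifice plate"
    k       -- orifice plate constant (must be positive)
    """
    if k <= 0:
        raise ValueError("orifice plate constant must be positive")
    if delta_p < 0:
        raise ValueError("pressure differential must not be negative")
    return math.sqrt(delta_p / k)
```

Because the flow varies as the square root of the differential, quadrupling the pressure drop only doubles the indicated flow.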
Manometers are connected to the pressure tappings to give a visual
check on the operation of the system as well as being used to calibrate the
pressure transducers. A variable downstream load is simulated by another
needle valve and gas is finally exhausted to atmosphere through a silencer.
8.6 ESTIMATED RELIABILITY 
Chapter three concludes that MIL 217D is pessimistic when estimating 
the reliability of integrated circuits, especially plastic. However so 
that the failure rate of the fault tolerant controller can be compared with 
the non fault tolerant controller of section 7.3, an order of merit 
comparison will be performed using MIL 217D data. It is likely that the 
failure rate estimated in this way will be pessimistic, so the failure rate 
calculation is repeated using the failure rate data sources recommended in 
chapter three. 
When analysing the fault tolerant controller several assumptions were 
made : 
(i) Failures are instantly repaired. If the MTTR is very much less
than the MTTF, then this assumption will not produce too large an 
error. For systems where the MTTR is not insignificant, the 
analysis of chapter four should be used. 
(ii) Three pressure transducers per channel are connected in a TMR 
configuration with the voting performed in software. For this 
reason the failure rate of the transducers may be neglected. 
(iii) The three microprocessors are connected in a TMR configuration 
with the voting performed in FPLAs. The failure rate of the 
microprocessors is neglected and only the failure rate of the 
voters is considered. 
(iv) A spare EPROM set is used therefore the failure rate of the 
EPROMS may be neglected. 
(v) The RAM is protected against single bit failures, so the failure
rate of the RAM chips is neglected and the failure rate of the
RAM is calculated as the failure rate of the extra Hamming code
circuitry.
(vi) The failure rate of the solenoids is neglected since frequent
replacement should prevent wearout and reduce their failure rate
to a low value.
(vii) The failure rate of the solenoid drivers is neglected since any
component failure is not fatal and is logged.
(viii) An estimation of the failure rate of the software is not included
in the controller failure rate.
Using MIL 217D and measured integrated circuit case temperatures, the
following failure rates are obtained for the controller using plastic
integrated circuits :
A/D converter             1.93 f/million hours
control circuitry         2.05 f/million hours
voters                    38.5 f/million hours
watchdog                  0.63 f/million hours
Hamming code circuitry    14.5 f/million hours
output circuitry          0.94 f/million hours
TOTAL                     58.5 f/million hours
                          = 0.5 f/year
Comparing these figures with those of section 7.3, an improvement in 
the failure rate by nearly an order of magnitude is observed. It is seen
that the Hamming code circuitry is no more reliable than a 4k block of 
unprotected memory. If more reliable encoding/decoding ROMs were used, or 
if the controller were operated at a lower temperature, then the Hamming 
code circuitry would be more reliable than a 4k block of memory. Although 
the Hamming code protected memory gives no decrease in failure rate
when permanent failures are considered, there is considerable benefit to be
gained from transient error protection. Castillo et al [78] report
transient failure rates which are up to fifty times greater than those of
permanent faults. The fault tolerant controller offers considerable protection
against transient errors which is not possible in the non fault tolerant 
design. 
8.6.1 Revised failure rate prediction 
The failure rate of the fault tolerant controller is now predicted 
using the recommended failure rate prediction sources of chapter three. 
Circuit block               Data source       Failure rate
A/D converter               MIL 217D          1.93 f/M hrs.
control circuitry           CNET              1.83 f/M hrs.
voters etc.                 CNET              18.0 f/M hrs.
watchdog                    CNET              0.62 f/M hrs.
Hamming circuitry           CNET              7.27 f/M hrs.
output circuitry            CNET              0.78 f/M hrs.
address decoders etc.       CNET              5.69 f/M hrs.
resistors (100)             NCSR              0.70 f/M hrs.
capacitors (100)            MIL 217D          0.40 f/M hrs.
wire wrap connections       ICL field data    1.60 f/M hrs.
soldered joints (500)       MIL 217D          1.30 f/M hrs.
edge connectors (200 prs)   CNET              0.60 f/M hrs.
TOTAL                                         40.7 f/M hrs.
                                              = 0.36 f/year
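The totals and the conversion to failures per year can be reproduced with a few lines of Python. The sketch assumes a year of 8760 hours, which is not stated in the text; the dictionary names are hypothetical and the figures are those tabulated in this section.

```python
HOURS_PER_YEAR = 8760   # assumption: 365 days of continuous operation

# Failure rates in failures per million hours.
MIL_217D = {
    "A/D converter": 1.93,
    "control circuitry": 2.05,
    "voters": 38.5,
    "watchdog": 0.63,
    "Hamming code circuitry": 14.5,
    "output circuitry": 0.94,
}

REVISED = {
    "A/D converter": 1.93,
    "control circuitry": 1.83,
    "voters etc.": 18.0,
    "watchdog": 0.62,
    "Hamming circuitry": 7.27,
    "output circuitry": 0.78,
    "address decoders etc.": 5.69,
    "resistors (100)": 0.70,
    "capacitors (100)": 0.40,
    "wire wrap connections": 1.60,
    "soldered joints (500)": 1.30,
    "edge connectors (200 prs)": 0.60,
}


def failures_per_year(rates):
    """Sum block failure rates (f/million hours) and convert to f/year."""
    total = sum(rates.values())
    return total, total * 1e-6 * HOURS_PER_YEAR


mil_total, mil_per_year = failures_per_year(MIL_217D)
rev_total, rev_per_year = failures_per_year(REVISED)
```

Summing the blocks in series in this way treats every block failure as a controller failure, which matches the pessimistic assumption made in the text.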
This failure rate does not include the failure rate of the power
supply, but is likely to be pessimistic for the rest of the controller
since all failures are assumed to be fatal, whereas some would result
only in degraded performance.
The failure rate of 0.36 f/year meets the required value as defined in 
section 7.3. It is therefore proposed that the controller is suitable for 
use in a hybrid pressure controller as described in chapter seven. 
CHAPTER 9 
GOVERNOR CONTROLLER SOFTWARE 
As a first step towards writing reliable software, modular programming 
was used. The software consists of nineteen modules written in assembler 
which are linked together to form the complete software package which is 
about 2k bytes in length. The detailed software listings are given in 
appendix(10) and a detailed description of each module follows, together
with any necessary flow charts. 
9.1.1 MAIN 
Module MAIN is located at address zero and is executed when a reset is 
performed. The module contains the jump vectors for interrupts and 
software restart instructions. The flow chart is given in figure(40). 
Interrupts are disabled at the start of the module so that the
initialisation process is not interrupted. Before the RAM is used by data
variables and the stack it is initialised to FF hex, i.e. all ones. This
is because of the snake which will be described later. The module INITL is
then called which initialises all data variables and registers. It is
important to initialise all registers to the same value, since if this is
not done, they will contain random data and PUSHing such a register onto
the stack will result in a voting error if the three microprocessor
registers contain different random data. The system reset is logged as
well as the time and EPROM set in use. As described in section 8.3 the
EPROM set in use is either the main or the spare set. Finally the RST/HOLD
circuitry is enabled before entering the main control loop. With the
interrupts disabled the RST/HOLD circuitry will have no effect other than
to insert hold pulses, but this slows down the processors, so the RST/HOLD
circuitry is disabled at the start of the module to speed up execution.
The rest of the module consists of the main control loop. Firstly a
"recovery block" is formed and stored in RAM which will be used to perform
vectored recovery if the system crashes. The interrupts are enabled after
the formation of the recovery block. This is important on the first pass
through the control loop because some interrupt routines use vectored
recovery, and if an interrupt and subsequent vectored recovery is executed
before the formation and storage of a recovery block, then vectored
recovery will not be possible and the system will "lock-up". Each pass
through the loop calls the modules CNTRL, PRBUFF, and SLFTST (to be
described later) and resets the watchdog timers.
9.1.2 RESYNC 
A voting error between the three processors generates a RST6.5
interrupt which causes a jump to the RESYNC module. This module, whose
flow chart is given in figure(41), performs the necessary software
resynchronisation, controls the resynchronisation hardware, and finally
logs the error. On entry to the module, the RST/HOLD circuitry is disabled
to speed up execution. If the RST/HOLD circuitry is successful, then the
program counters will be resynchronised.
All registers are first PUSHed onto the stack and then POPed off. In
this way a two out of three majority vote is performed on all registers and
finally all SOD output pins are reset to zero. The retry counter is now
decremented and tested for zero. If the count is not zero, the current
state of the error flags is read. If the error flags show no error, then a
transient error is logged, but if the error still exists, then hardware is
used to generate a RST6.5 interrupt signal and the processors enter a halt
state whilst waiting for the interrupt to be recognised. If the retry
counter reaches a count of zero, then all retries have been exhausted and a
CPU failure is logged as well as switching out the RST6.5 interrupt and
RST/HOLD circuitry. As well as logging the transient or permanent error,
the channel in error, syndrome (voter flags V1-V6) and number of retries
executed are logged. Finally the retry counter which is stored in RAM is
reset as well as the syndrome latch flip/flop. Vectored recovery is used
to return to the main control loop and is initiated by calling module
BLOCK.
The program counters are resynchronised by the RST/HOLD circuitry and
by executing a simultaneous interrupt. It is for this reason that retries
are performed by re-enabling the interrupts and then generating a hardware
interrupt pulse. Thus the retry mechanism is a mixture of hardware and
software techniques and each retry will attempt to resynchronise the
program counters, which could not be achieved by software alone.
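
The two out of three vote applied to each register as it is PUSHed and POPed can be illustrated in software. In the controller the vote is performed by the hardware voters at bus level, so the following is only a sketch of the logic, with hypothetical function names: each bit of the voted value takes whatever value at least two of the three channels agree on.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority of three register values."""
    return (a & b) | (b & c) | (a & c)


def vote_register(a: int, b: int, c: int):
    """Return the voted value and the list of channels (1..3) that disagree."""
    voted = majority(a, b, c)
    in_error = [n for n, value in enumerate((a, b, c), start=1) if value != voted]
    return voted, in_error


# Channel 2 holds corrupted data; the vote masks it and identifies the channel.
print(vote_register(0x3C, 0xFF, 0x3C))
```

Because the vote is bitwise, a single corrupted channel is always out-voted, whichever bits it has wrong.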
9.1.3 MERROR 
A memory error generates a RST5.5 interrupt which causes a jump to the
MERROR module, whose flow chart is given in figures(42,43). The module
first of all saves the error syndrome for future logging and tests if it is
possible to read and write to the RAM. If it is not possible, then
subroutines can no longer be used because they use the stack which is
contained in RAM. After a number of retries controlled by the retry
counter a jump is made to SFAIL which logs the failure of the RAM and
executes a soft failure which puts the system in a fail safe condition. If
it is possible to read/write 00 to the RAM, then the ability to read/write
FF is tested. Testing using both 00 and FF will reveal stuck-at faults.
The corresponding syndromes for a 00 and a FF read/write are both saved and
are now compared. If both syndromes are zero and the retry counter has not
been exhausted, then unless the RST5.5 pin is stuck at '1', the error is
logged as a transient. If the RST5.5 pin is stuck at '1', then a hardware
failure is logged and the RST5.5 interrupt is switched out. If the retry
count has been exhausted, then it is assumed that the error is permanent.
The two syndromes are now examined for a double bit error (DBE). If a DBE
does not exist, then if either syndrome is zero or if they are both equal,
the error is a single bit failure which is logged. If this is not the case,
then the extra hardware is in error and this is logged. All permanent
errors are logged and then the RST5.5 interrupt is switched out. If either
syndrome shows a DBE, then either there is a stuck-at DBE in the check bits
or the hardware has failed. The DBE must be in the check bits and not the
data bits since the ability to read/write to the RAM has already been
confirmed. To test for a stuck-at fault in the check bits, a table of test
data is written to the RAM and then read back. If there is no DBE in the
data or check bits, then the hardware must be faulty.
After the type of error has been logged, the syndrome and time are
logged and, in the case of a single bit error, a normal return from
interrupt is executed. If the error was a DBE, then the return address
held in RAM may have been corrupted and vectored recovery is executed
instead.
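
The reason for testing with both 00 and FF can be shown with a small simulation. This is a sketch only: the `FaultyCell` class is a hypothetical model of one byte of RAM, and the comparison is made directly on the data read back, whereas MERROR compares the syndromes produced by the Hamming code circuitry.

```python
class FaultyCell:
    """Hypothetical model of one 8-bit RAM location with stuck-at bits."""

    def __init__(self, stuck_at_one=0x00, stuck_at_zero=0x00):
        self.stuck_at_one = stuck_at_one    # mask of bits stuck at '1'
        self.stuck_at_zero = stuck_at_zero  # mask of bits stuck at '0'
        self.value = 0

    def write(self, byte):
        self.value = (byte | self.stuck_at_one) & ~self.stuck_at_zero & 0xFF

    def read(self):
        return self.value


def stuck_bits(cell):
    """Write 00 then FF and return (bits stuck at 1, bits stuck at 0)."""
    cell.write(0x00)
    stuck_one = cell.read()           # any bit reading 1 after writing 00
    cell.write(0xFF)
    stuck_zero = cell.read() ^ 0xFF   # any bit reading 0 after writing FF
    return stuck_one, stuck_zero


print(stuck_bits(FaultyCell(stuck_at_one=0x10)))
```

A bit stuck at '1' is invisible to the FF test and a bit stuck at '0' is invisible to the 00 test, so both patterns are needed to reveal every stuck-at fault.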
9.1.4 CNTRL
This module contains the software which controls the process. The
flow chart is given in figure(44). The pressures either side of the
orifice plate are read, Pgov being upstream of Pout. The pressure
differential across the orifice plate is calculated and the outlet pressure
of the regulator valve, Pgov, is controlled according to the equation:
Pgov = Pset + 2Pdelta, where Pset is an arbitrary set point
The desired pressure Pgov and the measured value are then compared.
If they are equal the solenoids are controlled to hold the pressure,
otherwise it is determined whether the pressure is too high or too low. If
the pressure error is greater than the defined deadband, the solenoids are
controlled to either lower or raise the pressure as appropriate. Finally a
delay is executed to prevent instability.
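
The decision made on each pass of the control module can be sketched as follows. The function name, the argument names and the returned strings are hypothetical, the units are arbitrary, and it is assumed here that an error inside the deadband is treated in the same way as no error, i.e. the pressure is held.

```python
def control_action(p_gov, p_out, p_set, deadband):
    """Return 'hold', 'raise' or 'lower' for one pass of the control loop."""
    p_delta = p_gov - p_out        # differential across the orifice plate
    desired = p_set + 2 * p_delta  # Pgov = Pset + 2Pdelta
    error = p_gov - desired        # positive when the pressure is too high
    if abs(error) <= deadband:
        return "hold"              # assumption: inside the deadband, hold
    return "lower" if error > 0 else "raise"


print(control_action(p_gov=32, p_out=28, p_set=26, deadband=0.5))
```

In the controller itself the action is carried out by driving the solenoids, and a delay follows each pass.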
9.1.5 PRESSR
This module combines time redundancy with component redundancy and the
flow chart is given in figure(45). Two pressure readings are required,
Pgov and Pout, which are the pressures either side of the orifice plate:
Pgov is upstream of Pout. A total of six pressure transducers are read,
arranged as two groups of three. Firstly all six pressure transducers are
read eight times and the average calculated and stored in a table. Time
redundancy such as this will lessen the effect of an erroneous read or
noise on the input. Then a majority vote is performed on each group of
three transducers. The transducers are taken in pairs and the three
different averages are calculated. Then each transducer reading is
compared with the average of the other two in order to determine if one
channel disagrees with its related pair. If a channel is found to be in
error, the error and time are logged as long as the retry count has not been
exhausted, and the pressure reading returned by the module is equal to the
average of the two "good" channels. If all three channels are in
agreement, then the average of the three readings is calculated and
returned by the module. Finally the pressure differential, Pdelta, across
the orifice plate is calculated.
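
The averaging and voting described above can be sketched as below. The text does not say how large a disagreement marks a channel as being in error, so the `tolerance` parameter is an assumption, as is the choice of flagging only the channel that deviates most from the average of the other two. All names are hypothetical.

```python
def averaged(read, samples=8):
    """Time redundancy: average several successive reads of one transducer."""
    return sum(read() for _ in range(samples)) / samples


def vote(readings, tolerance):
    """Vote on three averaged readings from one group of transducers.

    Returns (pressure, bad_channel) where bad_channel is 1..3 or None.
    """
    deviations = []
    for i in range(3):
        others = [readings[j] for j in range(3) if j != i]
        deviations.append(abs(readings[i] - sum(others) / 2))
    worst = max(range(3), key=lambda i: deviations[i])
    if deviations[worst] > tolerance:
        good = [readings[j] for j in range(3) if j != worst]
        return sum(good) / 2, worst + 1  # average of the two good channels
    return sum(readings) / 3, None       # all three channels agree


print(vote((100, 100, 140), tolerance=8))
```

Choosing the largest deviation matters because a single bad channel also shifts the pair averages that the two good channels are compared against.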
9.1.6 RBKGEN
A "recovery block" is generated. The contents of all registers are
saved in a block of memory for later use.
9.1.7 BLOCK
This module is complementary to RBKGEN and is used to execute vectored
recovery. The block of memory which contains the contents of all registers
at some time in the past is reloaded into the registers and execution is
restarted at the position the last recovery block was generated.
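
The relationship between RBKGEN and BLOCK can be sketched in a few lines. This is an illustration only: the class and method names are hypothetical, and a dictionary stands in for the block of RAM that holds the saved register contents.

```python
class RecoveryBlock:
    """Holds a snapshot of the registers for later vectored recovery."""

    def __init__(self):
        self._saved = None

    def generate(self, registers):
        """RBKGEN: save a copy of all register contents."""
        self._saved = dict(registers)

    def restore(self):
        """BLOCK: return the saved registers so execution can restart."""
        if self._saved is None:
            # No block has been formed yet, so recovery is not possible.
            raise RuntimeError("no recovery block has been formed")
        return dict(self._saved)


block = RecoveryBlock()
registers = {"A": 0x01, "HL": 0x2000, "PC": 0x0100}
block.generate(registers)
registers["A"] = 0xFF      # state is corrupted after the snapshot
registers = block.restore()
print(registers)
```

The error raised when no block exists corresponds to the "lock-up" that MAIN avoids by forming the recovery block before interrupts are enabled.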
9.1.8 SLFTST
This module is used to test the solenoids and driver circuitry and
could be expanded to self-test the whole of the system. The self-test
routine is executed every ten minutes under control of the clock and any
error in the solenoid circuitry is logged, together with the time. A
solenoid error is defined as any single component failure which is not
fatal because of the redundancy already described. The solenoids are only
tested every ten minutes, since in the event of a permanent failure, this
will cause the failure to be logged every ten minutes until repaired.
Logging any more frequently would be a nuisance. This is an alternative
technique to decrementing a retry counter every time an error is detected
and ceasing to log the error once the retry count has been exhausted.
9.1.9 SFAIL 
No RAM is used by this module since some soft failures are caused by 
total failure of the RAM. For this reason no subroutines are used. The 
module logs the type of soft failure and then puts the system into a
fail-safe condition, which for the experimental controller means failing to
the maximum pressure. After the fail-safe shut down, the processor is
halted. The processor will be interrupted from its halt state by the
watchdog timer and a system reset will be executed if possible. If not,
the shut down procedure will be repeated.
9.1.10 WTRAP 
A non-maskable TRAP interrupt causes a jump to this module. A TRAP
interrupt is generated when the first watchdog times-out and vectored
recovery is attempted. This module resets the printer, clears the output
buffer in case it has been corrupted and then logs the error and time as
well as the address at which the TRAP interrupt was called. A retry
counter is decremented each time the module is called and on every fifth
call, a full system reset is initiated.
This and the following modules log the address at which the error
occurred. This may be very useful in detecting software errors. If the
error always occurs at the same address, then further debugging at that
address is probably required to solve the problem.
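
The escalation performed by WTRAP, from vectored recovery to a full system reset on every fifth call, can be sketched with a simple counter. The class name and the returned strings are hypothetical, and reloading the counter after a reset is an assumption.

```python
class TrapEscalation:
    """Counts TRAP watchdog calls and escalates on every fifth call."""

    def __init__(self, calls_per_reset=5):
        self.calls_per_reset = calls_per_reset
        self.remaining = calls_per_reset

    def on_trap(self):
        """Return the action taken for this watchdog time-out."""
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self.calls_per_reset  # assumed: counter reloads
            return "system_reset"
        return "vectored_recovery"


trap = TrapEscalation()
print([trap.on_trap() for _ in range(5)])
```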
9.1.11 SNAKE 
This module is called by a software RST7 instruction, whose code is
conveniently FF. Data lines are pulled high and unused memory is filled
with FF so that an erroneous jump to non-existent memory or memory
initialised to FF will cause a RST7 to be executed. In this way erroneous
jumps are detected and execution may be safely vectored back. The module
clears the output buffer in case it has been corrupted and then logs the
error, time, and address at which the RST7 was called. Finally vectored
recovery is used to return to the main program.
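
The effect of filling unused memory with FF can be sketched as follows. The sketch assumes a 64k address space and a one-byte RST7 instruction, so that the address pushed onto the stack is one greater than the address fetched; the function names are hypothetical and the two program bytes are arbitrary placeholders.

```python
RST7_CODE = 0xFF  # code of the RST7 instruction, as stated in the text


def build_memory(program, size=0x10000):
    """Fill the whole address space with FF, then load the program at zero."""
    memory = bytearray([RST7_CODE] * size)
    memory[:len(program)] = program
    return memory


def snake_logged_address(memory, address):
    """Address the snake would log for a fetch at `address`, else None."""
    if memory[address] != RST7_CODE:
        return None                 # a normal instruction is fetched
    return (address + 1) & 0xFFFF   # return address pushed by the RST7


memory = build_memory(bytes([0x00, 0x76]))  # two placeholder program bytes
print(hex(snake_logged_address(memory, 0x003C)))
```

Any erroneous jump that lands outside the loaded program therefore ends up in the SNAKE module, which logs the address and returns by vectored recovery.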
9.1.12 DFAULT 
This module may be called to log software errors and is called by a 
RST5 instruction. If self checking of software is incorporated at a later
date, then a RST5 instruction will log the error and execute vectored
recovery. This module is not called by any others at present, and is used 
for demonstration purposes only. 
9.1.13 INITL 
This is called during a system reset and initialises all registers and
variables which are used later. 
9.1.14 COUT 
A character is converted to serial format and sent to the printer at 
1200 baud. 
9.1.15 COUTBF
This module puts a character in the output buffer. Output is buffered
so that the control process is not halted whilst printing messages and
logging errors.
9.1.16 PRBUFF
Unless the output buffer is empty, a character is fetched and sent to
the printer.
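
The way COUTBF and PRBUFF decouple logging from printing can be sketched with a first-in first-out queue. The class and method names are hypothetical, and it is assumed that at most one character is sent to the printer on each pass of the main control loop.

```python
from collections import deque


class OutputBuffer:
    """First-in first-out buffer between the loggers and the printer."""

    def __init__(self):
        self._queue = deque()

    def put(self, char):
        """COUTBF: queue a character without waiting for the printer."""
        self._queue.append(char)

    def service(self, send):
        """PRBUFF: send one character, unless the buffer is empty."""
        if self._queue:
            send(self._queue.popleft())


buffer = OutputBuffer()
for char in "OK":
    buffer.put(char)

printed = []
for _ in range(3):              # three passes of the control loop
    buffer.service(printed.append)
print(printed)
```

Since each pass sends at most one character, the control process is never held up by the speed of the printer.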
9.1.17 MSGE
A message whose address is pointed to by the HL register pair is put
into the output buffer.
9.1.18 NMOUT
The contents of the A register are converted to two ASCII hexadecimal
characters which are put in the output buffer.
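
The conversion performed by NMOUT can be sketched as below. The function name is hypothetical and the use of upper case for the digits A to F is an assumption.

```python
HEX_DIGITS = "0123456789ABCDEF"  # assumption: upper-case hexadecimal digits


def nmout(value: int) -> str:
    """Convert one byte to two ASCII hexadecimal characters."""
    high = (value >> 4) & 0x0F   # upper four bits
    low = value & 0x0F           # lower four bits
    return HEX_DIGITS[high] + HEX_DIGITS[low]


print(nmout(0x3D))
```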
9.1.19 TIMLOG
The date and time are read from the real time clock and put into the
output buffer.
CHAPTER 10
TESTING OF FAULT-TOLERANT SYSTEMS
10.1 SYSTEM DEBUGGING
Two techniques are available for debugging microprocessor systems:
10.1.1 In-circuit emulation
The microprocessor is removed and replaced with the emulation pod.
The emulator usually runs in real time and will therefore simulate any
timing problems. The emulation is controlled by the host microcomputer
development system. Software is developed on the development system,
stored on disk, and then executed on the test system by means of the
emulator. Breakpoints may be inserted into the program, the program flow
traced, and the contents of registers examined. Emulation is probably
easier than logic analysis, but does suffer from some disadvantages when
testing multiprocessor and fault-tolerant systems. In-circuit emulators
are available for multiprocessor systems, but are very expensive and are
usually dedicated to only one processor. They cannot be used to inject
faults into the system, other than RAM and register corruption, and
interference testing and the monitoring of recovery may cause the emulator
to crash, since this forms an important part of the system under test. In
fact an emulation pod cannot simulate a microprocessor for the purposes of
interference testing.
10.1.2 Logic analysis
A logic analyser is really a very fast data logger which will record,
in memory, the binary states of many parallel signals in real time. The
information captured in this way may be displayed at leisure on a CRT
screen in a user friendly format such as a pseudo timing display,
hexadecimal or binary list, or a disassembled listing of the code executed
by the microprocessor. Typically such a logic analyser will capture 48
channels of information at rates up to 25MHz and will store 1k samples.
Information is clocked into the logic analyser by an internal or external
clock. The logic analyser must be triggered and the triggering position
may be varied within the block of data captured. The triggering sequence
can be either the single occurrence of a trigger word or a complex sequence
of trigger words. Trigger words correspond to the 48 channels of input
data as well as extra clock qualifiers.
The logic analyser is a purely passive monitoring device and does not
interfere with the operation of the system under test. A logic analyser
was used to debug the controller described in chapter eight. Since the
voting is performed at bus level, the replacement of one microprocessor by
an emulation pod would cause the pod to be out-voted by the other two
microprocessors. For this reason logic analysis must be used, but it also
has several advantages. It does not interfere with the system under test
and can monitor all types of fault recovery, including the effects of
interference testing.
Since the logic analyser is purely passive, some method of loading
software into the test system must be provided. In the system of chapter
eight, a "monitor" was written to run on the test system which would
control the down-line loading of code into the test system, display memory
dumps, allow memory locations to be modified etc. Test software was stored
in battery backed RAM which could be write protected.
The logic analyser timing display was found to be very useful when
debugging hardware errors and testing the resynchronisation hardware.
Software can also be tested with the logic analyser. If the inputs to
the system under test are varied over their complete range, it is possible
to force the system to execute all possible paths via conditional jump
instructions. The complex triggering capability of the logic analyser can
be used to verify that all possible jump permutations have been executed.
10.2 FAULT INJECTION AND RECOVERY
In order to test the recovery from faults, it is necessary to provide
the means for injecting faults into the system. Test connectors should be
provided that allow faults to be injected into the microprocessors, RAM,
and recovery hardware. Faults are injected by a pulse generator; a short
pulse corresponds to a transient fault whilst a long pulse corresponds to a
permanent fault. The pulse generator may be triggered manually or by the
logic analyser.
The exact position in software of fault injection may be controlled
using the logic analyser. The logic analyser is programmed to trigger on a
combination of address, data, and control signals corresponding to a unique
position in the software. The trigger output from the logic analyser is
used to trigger the pulse generator, and the logic analyser can then
monitor the action of the recovery routine. In this way, the logic
analyser will capture normal operation followed by fault injection and
recovery. Thus the effectiveness of recovery routines can be tested for
faults injected anywhere in the software.
Random faults caused by power supply transients may be simulated by
interference testing. The interference simulator is connected in series
with the mains connection to the power supply. Voltage dips and transient
overvoltage spikes can be simulated, and recovery observed using the logic
analyser. Care must be taken that the transients do not affect the correct
operation of the logic analyser.
10.3 OPERATIONAL TESTING
Once the system has been debugged, it should be tested for all input
conditions over its complete operating temperature range.
The system should then be burnt-in at elevated temperature for several
hundred hours and then released for use.
All permanent and transient faults should be logged by the system,
preferably by a printer or some other non-volatile means, which will be
useful when diagnosing and repairing permanent faults. The logging of
transient faults will provide a useful history of the incidence of
transient faults in that particular environment, and may indicate that
preventative action is required, such as additional screening or filtering
of the power supply.
10.4 TESTING OF THE GOVERNOR CONTROLLER
10.4.1 Pressure control
The correct control of pressure was first of all tested according to
the relation:
Pgov = Pset + 2Pdelta
where Pdelta = Pgov - Pout
      Pgov   = outlet pressure of regulator
      Pout   = pressure downstream of orifice plate
The system was found to be stable with the solenoid needle valve fully
open. The needle valve which simulates the orifice plate was set to a
suitable value and then the downstream load was varied by means of the
second needle valve. A graph is plotted in figure(46) of the regulator
outlet pressure against the pressure differential across the "orifice
plate", Pdelta. The observed values of Pgov are compared with the
theoretical result shown by the straight line. Agreement between
theoretical and measured values is very close and since the system was also
found to be stable, it is concluded that pressure is correctly and
accurately controlled.
10.4.2 Fault injection
Faults were injected into the controller by a pulse generator which
was either triggered manually by a push button, or by the logic analyser.
The pulse generator had two open collector outputs which produced negative
and positive pulses. Fault recovery was traced using the logic analyser as
well as observing that pressure control was correct and that the correct
error message was printed out. The complete set of error messages as
produced by the printer is given in figure(47).
10.4.3 Voting errors
A long and a short pulse were injected into the RDY line of the channel
one microprocessor as shown in figure(28). The short pulse caused that
channel to lose synchronisation and message 1 was printed. This message
reports a transient error in channel 1 and one attempt (retry) was found to
resynchronise the three processors. The logic analyser timing trace of
figure(27) shows the timing of the resynchronisation hardware. Halfway
through the fault injection pulse, the 1+2 error flag goes low and at the
same time microprocessor 1 is seen to be out of synchronism with the other
two as shown by the ALE pulses. The positive edge of the latch pulse
latches the error syndrome which shows that the error is in channel 1. The
system clock at 2MHz is shown in the centre for comparison. A short time
after flagging the error, the RST/HOLD sequence is generated. Firstly a
RST6.5 pulse is held for ten clock cycles and then the HOLD pulse for ten
clock cycles, after which the RST/HOLD circuit resets. The RST/HOLD
sequence is seen to be successful since all ALE pulses are in synchronism
afterwards. A long pulse caused error message 2 to be printed. Attempts
to resynchronise were abandoned after 255 retries and a CPU failure of
channel 1 was logged. In both cases vectored recovery restored correct
operation of the system.
10.4.4 RAM errors
The pulse generator was connected to the RAM test circuitry of
figure(33). A short pulse injected into a single RAM chip caused error
message 3 to be printed. The pulse was short so a transient error was
logged and the syndrome S=10 correctly identified the RAM chip into which
the error had been injected.
A long pulse injected as above caused message 4 to be printed. The
pulse was long which caused the retry count to be exhausted. The system
therefore assumed that the RAM chip had failed permanently.
A long pulse was injected into the preset pin of U40b as shown in
figure(32). This caused a RST5.5 memory error interrupt to be generated
without an error actually existing. The error was correctly logged as a
hardware error as shown by message 5.
A long pulse was injected into two RAM chips which stored check bits.
Message 6 was printed which shows that there is a stuck-at double bit error
in the check bits.
A long pulse was injected into two RAM chips which stored data bits.
It was found that it was not possible to read and write to RAM so message 7
was printed and a soft failure was executed. The regulator was found
to fail safe to its maximum value.
10.4.5 Watchdogs
The controller was deliberately crashed and the first watchdog, the
TRAP watchdog, was found to restore correct operation via vectored
recovery. This error caused message 8 to be printed. The controller was
deliberately crashed five times, and on the fifth occasion a full system
reset was executed as logged by message 9, initiated by the second
watchdog timer. The trap watchdog routine is programmed to give a full
system reset after four attempts. A full system reset was also observed to
occur at power-on-reset and after a short power supply interruption which
was correctly detected by the undervoltage detector.
The system was crashed until a second full system reset was performed,
but this time message 10 was printed. This differs from message 9 in that
EPROM set 01 is logged instead of EPROM set 00. This confirms that the
spare EPROM set was correctly switched in by the RESET watchdog.
10.4.6 Snake
A hardware RST7.5 was generated. This caused a jump to 003C which
contains FF which is the code for a RST7 instruction. A RST7 was executed
and message 11 records operation of the snake and vectored recovery. The
address is given as 003D and not 003C because the program counter is
incremented before being pushed onto the stack.
10.4.7 Solenoid failure
A solenoid driver transistor was shorted out, simulating a short
circuit failure. The solenoid failure was correctly logged as shown by
message 12. The failure was logged once only and was at 30 minutes past
the hour which corresponds to a call of the self-test routine at ten minute
intervals.
10.4.8 Pressure transducers
The plastic pipe feeding pressure transducer 3 was bent double,
cutting off the transducer. Message 13 shows that the error was correctly
logged as channel 3.
10.4.9 Interference testing
The switching on and off of equipment near the controller caused
several transient errors to be logged. Interruptions to the power supply
were detected by the undervoltage detector and caused a full system reset.
Transient spike testing was not performed since this was felt to be
more a test of the power supply than the controller. The power supply is
fitted with "crowbar" overvoltage protection which prevents spike testing
since the crowbar would be triggered. Also any failure of the power supply
could destroy many integrated circuits which would be time consuming and
expensive to replace in the experimental controller. If it is required to
perform spike testing on the controller, it is suggested that a duplicate
controller is built which uses a professional grade power supply.
It is proposed that interruption testing is a more useful test since
short interruptions to the mains supply have been observed during an
electrical storm, caused by auto-reclosure of circuit breakers. Some
microprocessor equipment has been found to fail interruption testing when
the interruptions are very short (a few mains cycles).
- 123 -
DISCUSSION

The review of fault-tolerant controllers in chapter two has shown that most
of the work elsewhere has been concerned with large computers and that much
of the fault-tolerance is implemented in software, although many of the
techniques used on large computers are applicable to small controllers. The
low cost of small digital controllers makes the design of a reliable
controller using redundancy techniques much more cost sensitive. It has
been shown that built in testability can be included at little extra cost.
The use of redundancy will increase hardware costs as well as significantly
increasing design costs, but these costs should be offset by increased
reliability and availability.

In order to compare designs, or to meet certain design criteria, it is
necessary to predict the failure rate of components. The failure of
components is assumed to fit the constant portion of the "bath-tub" curve.
This assumes completion of the initial burn-in phase and that components do
not wear out. The wearout of integrated circuits is rarely experienced and
is normally caused by moisture corrosion when the device is operated in a
humid environment. In order to accelerate the failure of integrated
circuits for life-testing purposes or to extrapolate failure rate data to
different temperatures, the Arrhenius acceleration equation is used. The
failure rate of devices is shown to increase exponentially with
temperature. Life-test data is analysed using the Chi-squared distribution;
however, care must be exercised when extrapolating life-test data. For
random failures it is permissible to multiply the number of devices under
test by the test duration. This should not be done for non-random failures
and especially when failures are due to wearout. The failure rate of
integrated circuits is shown to dominate the overall failure rate of digital
controllers and a huge variation is observed in failure rate predictions for
integrated circuits. This variation is partly due to the use of different
activation energies and junction temperatures when calculating the increase
in failure rate due to temperature. It is important to use a realistic
activation energy and to use the typical power dissipation when calculating
the junction temperature. The predictions of MIL-217C are shown to be much
too high by about two orders of magnitude. The predictions of MIL-217D and
the NCSR data bank are more reasonable, but it is proposed here that the
CNET data is best when predicting the failure rates of industrial ground
based equipment.
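
The temperature and life-test reasoning above can be illustrated with a short sketch. It assumes the usual form of the Arrhenius equation, AF = exp[(Ea/k)(1/T_ref - 1/T_op)] with temperatures in kelvin, and uses the fact that for a test with no failures the Chi-squared bound (2 degrees of freedom) reduces to -ln(1 - C) divided by the accumulated device-hours. All function names and numerical values (activation energy, thermal resistance, power, test size) are hypothetical and are not taken from the thesis data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def junction_temperature_c(ambient_c, power_w, theta_ja_c_per_w):
    """Junction temperature from ambient, dissipation and thermal resistance."""
    return ambient_c + power_w * theta_ja_c_per_w


def arrhenius_acceleration(activation_energy_ev, t_ref_c, t_op_c):
    """Failure-rate multiplier at t_op_c relative to t_ref_c (both in deg C)."""
    t_ref_k = t_ref_c + 273.15
    t_op_k = t_op_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_op_k)
    return math.exp(exponent)


def zero_failure_upper_bound(device_hours, confidence):
    """Upper confidence bound on a constant failure rate after a failure-free test.

    Valid only for random (constant-rate) failures, where devices multiplied
    by hours is a meaningful exposure. Uses chi-squared(2 dof) = -2 ln(1 - C).
    """
    return -math.log(1.0 - confidence) / device_hours


# Hypothetical example: typical dissipation, not the data-sheet maximum.
t_j = junction_temperature_c(ambient_c=40.0, power_w=0.5, theta_ja_c_per_w=60.0)
af = arrhenius_acceleration(activation_energy_ev=0.7, t_ref_c=25.0, t_op_c=t_j)
bound = zero_failure_upper_bound(device_hours=100 * 1000, confidence=0.6)
print(f"Tj = {t_j:.1f} C, acceleration factor = {af:.1f}, bound = {bound:.3e} per hour")
```

Because the activation energy sits inside the exponent, a modest change in the assumed value or in the junction temperature moves the predicted failure rate by a large factor, which is consistent with the wide spread between prediction methods noted above.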
Research elsewhere [43,48] has demonstrated that CMOS is less reliable than
TTL and should only be used when it is essential to use devices with low
power consumption.

There is much published information suggesting that plastic encapsulated
devices should be used with caution since they have a higher failure rate
than ceramic devices. New plastics have improved their reliability, but
ceramic encapsulation is still better under conditions of high humidity,
since it prevents the moisture ingress which can cause corrosion and wearout
of the integrated circuit. There is considerable evidence from one
manufacturer [39] that under favourable conditions of low temperature and
humidity, there is no difference between the failure rates of plastic and
ceramic devices. Since the gas governor controller will be used in an
unprotected and possibly humid environment, it is recommended that ceramic
devices are used throughout.

All equipment should undergo burn-in prior to release in order to expose
weak components. It is recommended that integrated circuits are purchased
according to BS9400 grade C and are burnt-in after installation in the
equipment. After successful burn-in, the grade C devices may be considered
equivalent to the higher grade B devices which are predicted to have a
failure rate one tenth that of plastic devices.

All components should be derated as recommended here and the use of forced
cooling should be considered because of the exponential increase in failure
rate with temperature.

Transient error rates fifty times those of permanent failures have been
reported due to environmental, pattern sensitivity, and alpha radiation
effects. It is important to protect equipment against transient errors, for
which software fault-tolerance is ideal. The gas governor controller will
tolerate most classes of transient fault.

Maintenance and repair have been shown to achieve large improvements in the
reliability of redundant systems, but this is dependent on having a high
fault coverage.

When designing fault-tolerant systems, it is useful to adopt a structured
approach with defined levels of fault recovery. In this way protection will
be afforded against most classes of fault and the higher levels of
fault-tolerance should trap errors which are not corrected at the lower
levels. The use of FMECA and FMEA is useful for identifying the most
unreliable parts of a system which require further design effort.

The choice of microprocessor is important when designing a controller and
the size of the control task should be matched to the power of the
microprocessor, since the failure rates of microprocessors increase with
complexity. The choice of microprocessor may influence whether
fault-tolerance is implemented in hardware or software. Most of the
fault-tolerance in the governor controller is transparent and implemented in
hardware, since this allows standard software to be run on the controller.
The electromechanical gas governor is an example of a highly reliable
digital controller. The mechanical part of the governor is a standard piece
of equipment developed by British Gas and has the advantages of being simple
to control, cheap and intrinsically safe, and it will fail safe to pneumatic
control in the event of controller failure. The controller uses many of the
techniques discussed in this Thesis. Difficulty was experienced in
synchronising the three processors which was cured by the development of
special circuitry. In order for the TMR system to be more reliable than a
simplex system, the voters must be more reliable than a single
microprocessor. This could not have been achieved if the voters were
constructed from TTL, but was achieved by voting in FPLAs which are more
reliable and compact than TTL voters. The voters were found to be fast
enough to permit operation of the microprocessors at an internal speed of
3MHz. The voters were chosen to have open collector outputs since this
allowed wire ORing of the outputs and gave greater flexibility during
development. It would be better to use voters with "totem pole" outputs and
to perform the required ORing of the outputs using external TTL gates.
Pull-up resistors would no longer be required on the outputs and the rise
time of the outputs would be improved due to the active pull-up. This
should enable the voters to work with faster processors. The reliability of
the TMR configuration could be further improved if lower power voters were
made available. The currently used FPLAs are constructed in bipolar logic
which dissipates considerable power, resulting in a high failure rate.

The Hamming code protected memory offers no improvement in the failure rate
for permanent failures. This is because the failure rate of the protection
circuitry is equal to that of an unprotected 4k block of memory. An
improvement in the failure rate could be achieved if lower power encoding
and decoding ROMs were available, having lower failure rates. If the memory
size were increased above 4k, then the permanent failure rate would be
improved. The main benefit to be gained from protecting the memory is the
tolerance of transient errors which have been shown to occur up to fifty
times more frequently than permanent failures.

The watchdog reset circuitry was found to be very effective in resetting
the controller after it had crashed, and could be included at little extra
cost on even the most cost-sensitive non-redundant controllers. The piezo
pressure transducer amplifiers were found to drift and the printed circuit
board was found to be very sensitive to surface moisture. The problem of
moisture was cured by coating the underside of the board with varnish. The
voting on the pressure transducers made this drift less critical, but it
would be advisable to replace the amplifiers with purpose built transducer
amplifiers.

The writing and debugging of the controller software was made much easier
by making most of the fault-tolerance transparent and implemented in
hardware. The logic analyser was found to be a powerful tool when debugging
and testing the controller. The logic analyser timing display was
invaluable when debugging the hardware, as was the logic analyser
disassembler when tracing software execution. It would have been impossible
to develop this controller without the use of a powerful logic analyser.
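
The requirement stated above, that the voters must be more reliable than a single microprocessor for TMR to outperform a simplex system, can be checked with a small numerical sketch. It assumes constant failure rates and the textbook series model in which a single voter stage is in series with a 2-out-of-3 arrangement of identical modules, R = Rv(3Rm^2 - 2Rm^3). The function names and the failure rates used are hypothetical and are not the values predicted for the governor controller.

```python
import math


def simplex_reliability(failure_rate_per_hour, hours):
    """Reliability of a single module with a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def tmr_reliability(module_rate_per_hour, voter_rate_per_hour, hours):
    """2-out-of-3 modules in series with one voter stage (assumed model)."""
    r_module = simplex_reliability(module_rate_per_hour, hours)
    r_voter = simplex_reliability(voter_rate_per_hour, hours)
    return r_voter * (3.0 * r_module**2 - 2.0 * r_module**3)


# Hypothetical figures: 10 failures per million hours per processor, one year.
MODULE_RATE = 10e-6
MISSION_HOURS = 8760.0

simplex = simplex_reliability(MODULE_RATE, MISSION_HOURS)
tmr_good_voter = tmr_reliability(MODULE_RATE, 1e-6, MISSION_HOURS)
tmr_poor_voter = tmr_reliability(MODULE_RATE, MODULE_RATE, MISSION_HOURS)

print(f"simplex               : {simplex:.4f}")
print(f"TMR, voter 10x better : {tmr_good_voter:.4f}")
print(f"TMR, voter = processor: {tmr_poor_voter:.4f}")
```

Under this model a voter with the same failure rate as a processor always leaves the TMR arrangement worse than simplex, because the 2-out-of-3 term can never exceed one. Even with an ideal voter, the 2-out-of-3 term only exceeds the simplex reliability while the module reliability is above 0.5, so the benefit is confined to missions that are short relative to the module lifetime.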
The use of ICE and logic analysis in the testing of redundant systems is
discussed and it is concluded that logic analysis is more suitable for the
testing of TMR systems. Since the logic analyser does not interfere with
the system under test, it is suitable for interference testing and the
monitoring of certain classes of transient fault. It was necessary to
include test circuitry in the controller so that the different types of
fault recovery could be tested. A short pulse injected into the system was
used to simulate a transient fault, whilst a long pulse was used to simulate
a permanent failure. A useful test facility was constructed by using the
logic analyser to trigger the pulse generator used for fault injection. In
this way the logic analyser will monitor the operation of the controller
before the fault, during the fault, and during fault recovery.

The logging of transient errors by the controller, during normal operation,
will provide much useful information about the nature and frequency of
transient errors as they affect the controller.

Further work is suggested in the following areas. The controller should be
rebuilt on printed circuit boards and use professional grade power supplies
of proven reliability. The controller should then undergo interference and
environmental testing. Whilst the recovery software has been extensively
developed and tested, the complete software package could be expanded to
include more self-checking and exception handling. There is a requirement
within British Gas for a similar fault-tolerant controller, having a much
lower power consumption. If the increased failure rate of CMOS were
acceptable, then it should be possible to convert the controller design
presented here to CMOS.
LIST OF REFERENCES

1 CLULEY J.C - Electronic Equipment Reliability - Macmillan 1974
2 FANTINI F, LOLLI M, MICHELETTI G - Microtau : A highly reliable control
unit based on a commercially available microprocessor - Proc. Eurocon'82
Copenhagen - Pub. N.Holland p375-380
3 HARBERT F.C - High integrity safety systems for offshore platforms (A
triplicated microprocessor system for fire/gas and automatic shut-down
operations) - Proc. Eurocon'82 Copenhagen - Pub. N.Holland p433-437
4 LIM B.C.B - A fail safe microprocessor controller for a life support
system - Proc. Eurocon'82 Copenhagen - Pub. N.Holland p451-455
5 BLACK C.J, SUNDBERG C.E, WALKER W.K.S - Development of a spaceborne
memory with a single error and erasure correction scheme - Proc. 7th
Conf. on Fault Tolerant Computing 1977 p50-55
6 TOSCHI E.A, WATANABE T - A semiconductor memory with fault detection,
correction and logging - H.P. Journal p8-13
7 PLATTETER D.G - Transparent protection of untestable LSI microprocessors
- Fault Tolerant Computing Symposium Tokoyo Oct. 1980 p345-347
8 ANDERSON T, KERR R - Recovery blocks in action : A system supporting
high reliability - International Conf. on Software Engineering, San
Francisco, Oct. 1976
9 RANDELL B - System structure for software fault tolerance - IEEE Trans.
on Software Engineering - Vol.SE-1 No.2 June 1975 p220-232
10 CERU S, COPPADORO G, MORGANTI M - UDET 7116 : Common control for PCM
telephone exchange : Diagnostic software design and availability
evaluation - Proc. Ann. Int. Conf. F-T Computing, Toulouse France, June
1978 p16-23
11 GHANI N, HERON K, LEE P.A - A recovery cache for the PDP 11 - IEEE
Trans. on Computers - Vol.C-29 No.6 June 1980 p546-549
12 MORALEE D - Super-reliability computer keeps space shuttle safe -
Crosstalk Electronics and Power May 1981 p359-361
13 GELDERLOOS H.C, WILSON D.V - Redundancy management of shuttle flight
control sensors - Proc. 1976 IEEE Conf. on Decision and Control
p462-475
14 AHERN D.B, LAMONT G.B, PETERSON J.B - Microprocessor development for
a digital flight control system voter/monitor - IEEE Proc. National
Aerospace Electronics Conf., Dayton Ohio May 1976 p454-462
15 DEETS D.A, SZALAI K.J - Design and flight experience with a digital
fly by wire control system using Apollo guidance system hardware on a
F8 aircraft - Integrity in Electronic Flight Control Systems, Pub.
AGARD Neuilly sur Seine France April 1977 p21.1-21.30
16 FORSYTHE W, MARSHALL W.G - Self-checking multiprocessor module for
train control applications - Electronics Letters 11th June 1981
Vol.17 No.12 p408-410
17 FOOSE R - Module minimises repair time of process control systems -
Electronics 2nd March 1978 p121-124
18 SWARZ R.S - Reliability and maintainability enhancements for the
VAX-11/780 - Proc. Annual Int. Conf. F-T Computing Toulouse France June
1978 p24-28
19 CANEPA M, CLARK S, SIEWIOREK D - C.vmp : The architecture and
implementation of a fault-tolerant multiprocessor - Proc. 7th Conf. on
Fault Tolerant Computing 1977 p37-43
20 RYLAND H.A - Microprocessor systems for railway signalling
applications - IEE Colloquium on Microprocessor Applications requiring
High Integrity and Fault-tolerance 11th Oct. 1982
21 DAVIES O.J, OBAC-RODA V - Aspects of fault-tolerant ring structures -
IEE Colloquium on Microprocessor Applications requiring High Integrity
and Fault-tolerance 11th Oct. 1982
22 HIGUCHI T, KAMEYAMA M - Design of dependent-failure-tolerant
microcomputer system using triple-modular redundancy - IEEE Trans. on
Computers Vol.29 1980 p202-205
* 23 MURPHY D.T - Demand Activated Governing - British Gas internal report
- ERS Governor Seminar March 1979
* 24 PICKERING E.W, SPEARMAN C.A, TANNER M.W.G - Leakage control in the
natural gas era - British Gas internal report ERS GC175 1970
* 25 DREW W.A - Review of equipment fault data collection exercise -
British Gas internal report ERS R.954
* 26 BLAKE J.D, DREW W.A - Reliability models of pressure reduction
stations - British Gas internal report ERS R.850
27 PASCOE W - 2107A/2107B N-channel silicon gate MOS 4k RAMs - Intel
reliability report RR-7 1975
28 EUZENT B - Intel 2116 N-channel silicon gate 16k dynamic RAM - Intel
reliability report RR-16 1977
29 ROSENBERG S - Intel 2716 16k UV erasable PROM - Intel reliability
report RR-19 1979
30 PASCOE W - Polysilicon fuse bipolar PROMs - Intel reliability report
RR-8 1975
31 PASCOE W - MOS static RAMs - Intel reliability report RR-9 1975
32 NICHOLS N - 8080/8080A microcomputer - Intel reliability report RR-10
1976
33 INTEL - Component Data Catalog 1980
34 INSPEC - Electronic reliability data : A guide to selected components
- Pub. IEE London 1981
35 REYNOLDS F.H - Measuring and modelling integrated circuit failure
rates - Proc. Eurocon'82 Copenhagen - Pub. N.Holland p32-45
36 Military Standard Handbook MIL-HDBK-217C 1979 Dept.Defense USA
37 Military Standard Handbook MIL-HDBK-217D 1982 Dept.Defense USA
38 CNET - Recueil de Données de Fiabilité - Edition 1980
39 MOTOROLA - 1982 Microprocessor Family Reliability - Reliability report
8238 Sept. 1982
40 MOTOROLA - 1982 Memory and micro Reliability - Reliability report
No.83/N001
41 MOTOROLA - MC68000G microprocessor - Reliability report 8243 Oct. 1982
42 MOTOROLA - Microcomputer components 1979
43 PETERSON P.W - The performance of plastic encapsulated CMOS
microcircuits in a humid environment - IEEE Trans. on Components,
Hybrids and Manufacturing Technology Vol.CHMT-2 No.4 1979 p422-427
44 FEDERICI F, MAMMUCARI F, TURCONI G - Influence of plastic encapsulated
IC on TLC equipment reliability performance - Proc. Eurocon'82
Copenhagen - Pub. N.Holland p259-264
45 KLEIN M.R - Microcircuit device reliability : Memory/LSI data - RADC
report MDR-13 1979
46 DANIELS B.K, HUMPHREYS M - How do electronic system failure rate
predictions compare with field experience - Proc. Eurocon'82
Copenhagen - Pub. N.Holland p935-944
47 CLARIDGE A.N - A comparison between predicted and observed reliability
in a large instrumentation and protection system - Proc. Eurocon'82
Copenhagen - Pub. N.Holland p421-426
48 JOHNSON G.M, STITCH M - Microcircuit accelerated testing reveals life
limiting failure modes - 15th Annual Proc. Reliability Physics Las
Vegas USA April 1977 p179-195
49 O'CONNOR P.D.T - Practical Reliability Engineering - Pub. Heyden 1981
50 HAKIM E.B, REICH B - Can plastic semiconductor devices and
microcircuits be used in military equipment - Proc. 1974 Annual
Reliability and Maintainability Conf. Los Angeles USA Jan 1974 p396-402
51 HAKIM E.B, REICH B - Environmental factors governing field reliability
of plastic transistors and integrated circuits - Proc. 1972
Reliability Physics Symposium
52 PECK D.S, ZIERDT C - Temperature-humidity acceleration of metal
electrolysis failure in semiconductor devices - Proc. 1973 Reliability
Physics Symposium
53 LAWSON R.W - The qualification approval of plastic encapsulated
components for use in moist environments - Proc. on Plastic
Encapsulated Devices RSRE May 1976
54 LYCOUDES N - The reliability of plastic microcircuits in moist
environments - Solid State Technology Oct. 1978 p53-62
55 SMITH D.J - Reliability and Maintainability in Perspective - Pub.
Macmillan
56 JENSEN F, PETERSEN N.E - Burn in : An engineering approach to the
design and analysis of burn-in procedures - Pub. Wiley 1982
57 JENSEN F - Reliability Tutorial - Presented at Eurocon'82 Copenhagen
58 NATIONAL SEMICONDUCTORS - The Reliability Handbook Vol.1 2nd Edition
1982
59 NATIONAL SEMICONDUCTORS - DP8400 E2C2 Expandable Error Checker and
Corrector Oct. 1981
60 ANDERSON T, LEE P.A - Fault Tolerance : Principles and Practice - Pub.
Prentice Hall 1981
61 MOTOROLA - 64k Dynamic RAM Reliability Design Manual 1981
62 HAYES J.P - Testing memories for single-cell pattern-sensitive faults
- IEEE Trans. on Computers Vol.C-29 No.3 1980 p249-254
63 REDDY S.M, SUK D.S - Test patterns for a class of pattern-sensitive
faults in semiconductor random access memories - IEEE Trans. on
Computers Vol.C-29 No.6 1980 p219-226
64 ABRAHAM J.A, THATTE S.M - Testing of semiconductor random access
memories - Proc. Annual Int. Conf. F-T Computing Los Angeles California
June 1977 p81-87
65 BRODSKY M - Hardening RAMs against soft errors - Electronics April 24
1980 p117-122
66 BRAUER J.B, KAPFER V.C, TAMBURRINO A.L - Can plastic encapsulated
microcircuits provide reliability with economy - Microelectronics (GB)
Vol.1 1970 p5-24
67 FOX M.J - A comparison of the performance of plastic and ceramic
encapsulations based on evaluation of CMOS integrated circuits -
Microelectronics and Reliability Vol.16 1977 p251-254
68 KNIGHT L, LUCAS P - Observed failure rates of electronic components in
computer systems - Microelectronics and Reliability Vol.15 1976
p239-243
69 DUMMER G.W.A - Electronic Components : Past Present and Future -
Electronic Components 1970
70 GISSING J.G - BS9000 components and reliability quality factors :
suggested use of MIL-HDBK-217C factors based on a comparative product
assurance analysis - Microelectronics and Reliability Vol.21 1981
p683-697
71 DHILLON B.S, SINGH C - Engineering Reliability : New Techniques and
Applications - Pub. Wiley 1981
72 ARNOLD T.F - The concept of coverage and its effect on the reliability
model of a repairable system - IEEE Trans. on Computers Vol.C-22 No.3
1973 p251-254
73 PEARSON J.C, PREECE C - Hardware and software aspects of fault
coverage in small digital controllers - Proc. Eurocon'82 Copenhagen -
Pub. N.Holland p663-668
74 HALSE R.G, PEARSON J.C, PREECE C - The introduction of fault tolerance
into digitally controlled gas regulators - Proc. 1983 International
Gas Research Conference London
75 HUNGER A - Characterisation test of microprocessors using a new
approach in self-testing - Proc. Eurocon'82 Copenhagen - Pub. N.Holland
p906-910
76 DANIELS S.F, FASANG P.P - Microbit : A microcomputer with built in
diagnostics - Proc. Eurocon'82 Copenhagen - Pub. N.Holland p625-629
77 HAMMING R.W - Error detecting and error correcting codes - Bell
Systems Technical Journal Vol.29 1950 p147-160
78 CASTILLO X, McCONNEL S.R, SIEWIOREK D.P - Derivation and calibration
of a transient error reliability model - IEEE Trans. on Computers
Vol.C-31 No.7 1982 p658-671
79 TEXAS INSTRUMENTS - Error detection and correction using SN54/74LS360
or SN54/74LS361 - Bulletin CA-201
80 ADVANCED MICRO DEVICES - Am2960 Cascadable 16 bit error detection and
correction unit
81 PREDICTOR - Computer software package marketed by Management Sciences
Inc. - 6022 Constitution Ave, Albuquerque N.M. 87110
FIGURE 1 Cross section of a Donkin Fig.280 regulator
(Labelled parts: colour coded spring; two-way vent valve; die cast
aluminium casings; valve travel indicator; rolling type balancing
diaphragm; balance tube; stainless steel orifice; nitrile valve seat; cast
iron valve body.)
FIGURE 2 Governor and distribution network
(Labels: governor; worst case pressure.)
FIGURE 3 Twin stream Governor
(Labels: working stream; standby stream; slam shut; regulator valve;
pressure settings 18", 20", 14" and 14" w.g.)
FIGURE 4 The bath-tub curve
(Failure rate plotted against time, showing the infant mortality, constant
failure rate and wear-out regions.)
FIGURE 5 TMR voter and error detection in open collector TTL logic
(Labels: A.C; A.B; A.C + A.B + B.C 2/3 majority vote; ERROR; 1k pull-ups to
+5V.)
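
The majority expression labelled in Figure 5, A.C + A.B + B.C, can be written bitwise so that a whole bus word is voted at once. The sketch below is an illustration in software of the logic only; the function names and the example bus values are hypothetical and do not describe the FPLA programming used in the controller.

```python
def vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three bus words: A.B + A.C + B.C."""
    return (a & b) | (a & c) | (b & c)


def disagreement(a, b, c):
    """Mask of bit positions on which the three inputs are not unanimous."""
    # If b and c differ on a bit, at least one of them differs from a,
    # so comparing against a alone covers every case.
    return (a ^ b) | (a ^ c)


def errant_channels(a, b, c):
    """Indices (0, 1, 2) of the channels whose word differs from the vote."""
    voted = vote(a, b, c)
    return [i for i, word in enumerate((a, b, c)) if word != voted]


# Hypothetical bus values: channel 2 has one corrupted bit.
a, b, c = 0x3A, 0x3A, 0x7A
print(f"voted = {vote(a, b, c):02X}")
print(f"error mask = {disagreement(a, b, c):02X}")
print(f"errant channels = {errant_channels(a, b, c)}")
```

A non-zero disagreement mask corresponds to the ERROR output: the voted word is still correct with a single faulty channel, but the fault is flagged so that it can be logged and recovery attempted.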
FIGURE 6 TMR Ring structure
FIGURE 7 Graph of acceleration factor vs operating temperature for a
reference temperature of 25°C. Plotted for different activation energies.
(Axes: acceleration factor, logarithmic scale; operating temperature /°C,
20 to 160.)
FIGURE 8 Failure rate of a 8085 Microprocessor vs case ambient temperature
(Axes: failure rate, logarithmic scale; Tamb. /°C, 10 to 70.)
FIGURE 9 BS9400 Screening Procedure - see reference (58)
(Flow chart from wafer fabrication through screening levels A, B, C and D.
Steps shown: internal visual inspection; bake 24 hrs at 150°C; temperature
cycle, 10 cycles, -65°C to 150°C; mechanical shock, 5 pulses at 1500 g;
constant acceleration at 30 kG; fine leak and gross leak tests; 25°C dc
electricals; burn-in for 240 or 160 hrs at 125°C; HTRB 72 hrs at 150°C
where applicable; X ray; Group A, B, C and D testing.)
FIGURE 10 Graph of Simplex and TMR reliability vs normalised mission time
FIGURE 11 Graph of MTTFIF vs ratio λ2/λ1
FIGURE 12 Fault tree analysis for pressure trip
(Labels: signal from channel 1; signal from channel 2; OR gates; tripped.)
FIGURE 13 Software recovery using "recovery blocks"
(Labels: routine 1; routine 2; routine 3; routine 4; generate recovery
block; recovery.)
FIGURE 16 Block diagram of non fault-tolerant controller
(Labels: P1 pressure transducer; P2 pressure transducer; A/D conv.; control
circuitry; microprocessor; 4k RAM; I/O to solenoids; watchdog.)
FIGURE 17 Op-code fetch timing showing synchronisation problem
(Labels: clock states T1, T2, T3, T4, T1; two synchronised microprocessors;
third errant microprocessor.)
FIGURE 18 RST/HOLD Timing
(Labels: RST 6.5; HOLD; RST/HOLD circuitry resets here; clock cycles.)
FIGURE 19 Photograph of governor controller 
FIGURE 20 Photograph of pneumatic test rig
[Block diagram; legible labels: CLOCK, RESET, RESYNC, RS232 INTERFACE, SOD, Address, Data, Control, Error flags]
FIGURE 21 Block diagram of Microprocessor board
[Circuit diagram; legible labels: DATA BUS OUTPUT, TO VOTERS, DATA BUS INPUT]
FIGURE 22
[Circuit diagram; legible labels: DATA BUS OUTPUT, TO VOTERS, DATA BUS INPUT]
FIGURE 23
[Circuit diagram; legible labels: DATA BUS OUTPUT, TO VOTERS, DATA BUS INPUT]
FIGURE 24
[Circuit diagram; legible label: SYSTEM BUS - All FPLA outputs pulled up to +5V by 1k resistors]
FIGURE 25
FPLA 
OUTPUT 
OUTPUT FROM 
LOW PASS FILTER 
V e r t i c a l = 5V/cm 
Horiz. = 50ns/cm 
FIGURE 26 FPLA error signal showing the effect of filtering
[Logic analyser traces; legible labels: EXT CLK, Test pulse, 1+2, 2+3, Latch error flags, CLK OUT, RST 6.5, HOLD, ALE(1), ALE(2), ALE(3)]
FIGURE 27 Resynchronisation timing monitored by logic analyser
[Circuit diagram; legible labels: Self-synchronising clock circuitry (U22, U23), Synchronisation test circuitry, RST 7.5 (Monitor) interrupt switch, RS232 Interface (U7, U26)]
FIGURE 28 Clock, test, and RS232 interface
[Circuit diagram; legible labels: U18b, U21a, U23d, U27b, U28b, U28c, U28d, CLK OUT, RST 6.5]
FIGURE 29 Resynchronisation circuitry
[Circuit diagram; legible labels: U18a, U19a]
FIGURE 30 Watchdogs and reset circuitry
[Block diagram; legible labels: DATA IN, Hamming code encoding, ADDRESS BUS, INSTANT ROM, RAM, ADDRESS DECODER, Hamming code decoding, DATA OUT, ERROR FLAGS, Memory decoding, PORT(RD) decoding]
FIGURE 31 Block diagram of memory board
[Circuit diagram; legible labels: DATA BUS INPUT, DATA BUS OUTPUT]
FIGURE 32
[Circuit diagram; RAM devices and the bit each stores:
U55 C7   U56 C6   U57 C5   U58 C4
U59 D7   U60 D6   U61 D5   U62 D4
U63 C3   U64 C2   U65 C1   U66 C0
U67 D3   U68 D2   U69 D1   U70 D0]
FIGURE 33 Detail of RAM storage circuitry including test circuitry
[Circuit diagram; legible labels: ADDRESS BUS, Instant ROM; the decoded address ranges and port labels are illegible]
FIGURE 34 Instant ROM, EPROM, and PORT(RD) decoder circuit diagram
[Circuit diagram; legible labels: write protect, S2, U80, U81b, +5V]
FIGURE 36 Input/output board
[Circuit diagram; legible labels: MUX, SOLENOID 1, SOLENOID 2]
FIGURE 37 Stepper motor and redundant solenoid drive circuitry
[Circuit diagram; legible label: PIEZO TRANSDUCER]
FIGURE 38
MAIN
disable interrupts and RST/HOLD
initialise all RAM to FF
CALL INITL - initialise all registers and variables
log reset and EPROM set in use
enable RST/HOLD
generate recovery block
enable interrupts
reset watchdog
CALL CNTRL - pressure control algorithm
CALL PRBUFF - print a character from buffer
CALL SLFTST - self test of system and solenoids
FIGURE 40 Flow chart of module MAIN
RESYNC
disable RST/HOLD and reset stack pointer
PUSH all registers onto stack
POP all registers off stack
reset SOD pin
decrement retry counter
enable interrupts and RST/HOLD
read current error flags
count = zero ? (YES)
log transient error
generate RST 6.5 interrupt
halt and wait for interrupt
(YES)
switch out RST 6.5 and disable RST/HOLD
log CPU failure
log : channel in error, syndrome, number of retries
reset retry counter in RAM
reset syndrome latch flip/flop
execute vectored recovery
FIGURE 41 Flow chart of module RESYNC
MERROR
set syndrome = 00
save registers
read and save syndrome
initialise retry counter
read/write 00 to RAM
test RST 5.5 pin (YES / NO)
read syndrome for 00 read/write
save syndrome and read/write FF to RAM
read syndrome for FF read/write
decrement retry counter (YES / NO)
read/write test data to RAM
log DBE in check bits (YES)
log hardware failure (NO)
FIGURE 42 Flow chart of module MERROR
MERROR
decrement retry counter
JMP SFAIL
soft failure without RAM
log single bit failure
either syndrome = zero or both syndromes equal : condition true ? (YES / NO)
log hardware failure
JMP BLOCK
vectored recovery
test RST 5.5 pin (YES / NO)
log hardware failure
switch out RST 5.5 interrupt
log saved syndrome, time and date
log transient error
restore registers and return
FIGURE 43 Flow chart of module MERROR
CNTRL
CALL PRESSR
read Pgov, Pout, Pdelta
calculate desired pressure
P = Pset + 2Pdelta
calculate error
P - Pgov
calculate error
Pgov - P
(decisions: YES / NO)
increase pressure / hold pressure / decrease pressure
delay
FIGURE 44 Flow chart of module CNTRL
PRESSR
decrement retry counter
count = zero ?
For each of 6 channels : store average of 8 readings in table
For both groups of 3 channels : calculate 3 different averages (A+B)/2, (B+C)/2, (A+C)/2 and store in table
Compare each channel with the average of the other 2 : |A - (B + C)/2| etc.
(YES) log channel in error
log time and date
all 3 channels compared ? (YES / NO)
return average of 3 channels (A+B+C)/3
return average of 2 channels excluding channel in error
FIGURE 45 Flow chart of module PRESSR
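The voting step in the PRESSR flow chart can be sketched as follows. This is only an illustration in Python, not the thesis's 8085 code: the function name, the `tolerance` threshold and the choice to blame the single worst channel are assumptions, and the retry counter and logging boxes are omitted.

```python
from statistics import mean

def vote_pressure(a, b, c, tolerance):
    """Vote on one group of three pressure channels.

    a, b, c   : raw readings per channel (the flow chart averages 8 of them)
    tolerance : assumed disagreement threshold; the flow chart gives no value
    Returns (pressure, index_of_channel_in_error_or_None).
    """
    avg = [mean(a), mean(b), mean(c)]
    # Compare each channel with the average of the other two: |A - (B + C)/2| etc.
    deviation = [
        abs(avg[i] - (avg[(i + 1) % 3] + avg[(i + 2) % 3]) / 2)
        for i in range(3)
    ]
    worst = max(range(3), key=lambda i: deviation[i])
    if deviation[worst] <= tolerance:
        # All channels agree: return the average of 3 channels (A + B + C)/3.
        return sum(avg) / 3, None
    # Otherwise return the average of the 2 channels excluding the one in error.
    others = [avg[i] for i in range(3) if i != worst]
    return sum(others) / 2, worst
```

A single faulty channel shifts its own deviation twice as far as it shifts the other two, which is why the sketch picks the largest deviation rather than flagging every channel that exceeds the threshold.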
[Graph; vertical axis scaled 10 to 17, horizontal axis ΔP / inches water gauge scaled 1.0 to 3.5; legend: line = Theoretical result, ▲ = Measured value]
FIGURE 46 Graph of governor outlet pressure versus orifice plate differential
DATE 15:07 HR [illegible]
VECTORED RECOVERY
SYND=FA RETRIES=01
SYNC ERR CHANNEL 01
DATE 15:07 HR 13:17
TRANS RAM ERR S=10
DATE 15:07 HR 13:18
RAM HWARE FAIL S=00
DATE 15:07 HR 13:17
CBIT STUCK DBE S=20
DATE 15:07 HR 13:16
PRESSR ERROR CH 03
Message 1
DATE 15:07 HR 13:15
VECTORED RECOVERY
SYND=DE RETRIES=FF Message 2
CPU FAIL CHANNEL 01
Message 3
DATE 15:07 HR 13:26 Message 4
RAM FAILURE S=10
Message 5
Message 6
TOTAL RAM FAILURE Message 7
DATE 15:07 HR 13:22
VECTORED RECOVERY Message 8
TRAP WDOG ADR=0639
DATE 15:07 HR 13:26
EPROM SET 00 Message 9
FULL SYSTEM RESET
DATE 15:07 HR 13:24
EPROM SET 01 Message 10
FULL SYSTEM RESET
DATE 15:07 HR 13:19
VECTORED RECOVERY Message 11
SNAKE RST7 ADR=003D
DATE 15:07 HR 13:30
SOLENOID FAILURE Message 12
Message 13
FIGURE 47 Error messages
TABLE 1 Failure mechanisms and activation energies in MOS semiconductors

Failure mechanism                    Type             Activation energy eV   Detection                                      Preventive measures
slow trapping                        wearout          1.0                    high temp. bias                                ultra clean processing
contamination                        wearout/infant   1.4                    high temp. bias                                ultra clean processing
surface charge                       wearout          0.5 - 1.0              high temp. bias                                ultra clean processing
polarisation                         wearout          1.0                    high temp. bias                                eliminate phosphorus in gate oxide
electromigration                     wearout          1.0                    high temp. operating life                      J < 10^[exponent illegible] A/cm^2
microcracks                          random           -                      temperature cycling                            contoured oxide steps
contacts                             wearout/infant   -                      high temp. operating                           ultra clean processing
oxide defects                        infant/random    0.3                    high voltage operating life and cell stress    ultra clean processing
aluminium corrosion                  wearout          0.8                    leak detection/moisture tests                  improve packaging
galvanic bond pad corrosion          wearout          0.6                    leak detection/moisture tests                  improve packaging
  in plastic encaps.
charge loss from N channel EPROM     wearout          0.8                    high temp. tests                               improved design
TABLE 2 A comparison of resistor failure rates

Source                        Failure rate f/10^6 hrs.
ICL data reference(68)        0.004
Dummer reference(69)          0.015
NCSR reference(34)            0.007
CNET reference(38)            0.0035
MIL 217D reference(37)        0.0035
MIL 217C reference(36)        0.0035

Failure modes : 90% open circuit, 10% short circuit
Failure rates calculated for an oxide film resistor in a computer environment with R < 100k,
Temperature = 25°C and stress = 0.1 (operating wattage/rated wattage)
TABLE 3 A comparison of capacitor failure rates

Source                        Failure rate f/10^6 hrs.
ICL data reference(68)        0.003
Dummer reference(69)          0.08
NCSR reference(34)            0.021
CNET reference(38)            0.0035
MIL 217D reference(37)        0.004
MIL 217C reference(36)        0.004

Failure modes : 50% open circuit, 50% short circuit
Failure rates calculated for a 0.1uF polycarbonate capacitor in a computer
environment, with Tamb. = 25°C and stress = 0.1 (operating voltage/rated)
TABLE 4 A comparison of soldered joint failure rates

Source                        Failure rate f/10^6 hrs.
ICL data reference(68)        0.002
Dummer reference(69)          0.008
CNET reference(38)            0.0005
MIL 217D reference(37)        0.0026
MIL 217C reference(36)        0.0026
TABLE 5 A comparison of wire-wrap joint failure rates

Source                        Failure rate f/10^6 hrs.
ICL data reference(68)        0.0008
Dummer reference(69)          0.0007
CNET reference(38)            0.00001
MIL 217D reference(37)        0.0000025
MIL 217C reference(36)        0.0000025
TABLE 6 A comparison of edge connector failure rates

Source                        Failure rate f/10^6 hrs.
ICL data reference(68)        0.0030
Dummer reference(69)          0.14
CNET reference(38)            0.0030
MIL 217D reference(37)        0.0001
MIL 217C reference(36)        0.0001

Failure rates calculated for a mating pair of contacts connected once only
without repetitive mating/un-mating.
MIL 217 and CNET failure rates calculated for a mating pair of contacts
within a 64 way connector of material type B.
TABLE 7 Activation energies used in failure rate prediction

                                            Activation energy eV
Technology                 Package type     MIL 217D   MIL 217C   NCSR
TTL HTTL DTL ECL           hermetic         0.40       0.40       0.40
                           non hermetic     0.45       0.40       0.40
LTTL STTL                  hermetic         0.45       0.40       0.40
                           non hermetic     0.50       0.40       0.40
LSTTL                      hermetic         0.50       0.40       0.50
                           non hermetic     0.55       0.40       0.50
I²L MNOS                   hermetic         0.60       0.40       -
                           non hermetic     0.80       0.40       -
PMOS                       hermetic         0.50       0.70       0.55
                           non hermetic     0.70       0.70       0.55
NMOS CCD                   hermetic         0.55       0.70       0.55
                           non hermetic     0.80       0.70       0.55
CMOS, CMOS/SOS             hermetic         0.65       0.70       0.55
and linear                 non hermetic     0.90       0.70       0.55
TABLE 8 A comparison of TTL integrated circuit failure rates

Source                        Failure rate f/10^6 hrs.   Ea eV
ICL data reference(68)        0.014                      -
NCSR reference(34)            0.031                      0.4
CNET reference(38)            0.050                      0.3 & 1.0
MIL 217D reference(37)        0.084                      0.45
MIL 217C reference(36)        0.533                      0.40

Failure rates calculated for a plastic package containing 4 gates and in a
computer environment with Tamb. = 25°C and Tjunct. = 33°C
TABLE 9 A comparison of the failure rates for a 6800 Microprocessor

Source                        Failure rate f/10^6 hrs.   Ea eV        Tj °C
Motorola reference(39)        0.30                       1.0          105
MIL 217D reference(37)        7.80                       0.55         120
MIL 217C reference(36)        1380                       0.70         120
CNET reference(38)            6.60                       0.3 & 1.0    120
NCSR reference(34)            6.40                       0.55         130

Failure rates calculated for a hermetic package containing 1367 gates in a
"ground fixed" environment at Tamb. = 70°C
θja specified by reliability data source
Pdiss = worst case = 1W (except for Motorola = 0.5W typical)
Quality level = C1 (approximately computer grade)
TABLE 10 A comparison of 6800 Microprocessor adjusted failure rates

Source                        Failure rate f/10^6 hrs.
Motorola reference(39)        0.013
MIL 217D reference(37)        6.60
MIL 217C reference(36)        280
CNET reference(38)            1.3 **
NCSR reference(34)            4.01

Failure rates converted to a common base of :
Tamb. = 45°C
θja = 50°C/W
Pdiss = 0.5W (typical)
Tj = 70°C
Ea = 1.0eV
Quality level = C1
** Since CNET data uses both 0.3 and 1.0eV activation energies, it is not
possible to convert to a single activation energy of 1.0eV, however the
junction temperature is adjusted from 120°C to 70°C.
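As an illustration of the temperature part of this conversion, the sketch below computes the common-base junction temperature and rescales the Motorola figure of Table 9 with a single-activation-energy Arrhenius factor. It assumes that this simple model is what lies behind the Motorola row; the function names are hypothetical, and the handbook rows (MIL 217, CNET, NCSR) involve further model factors that are not reproduced here.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def junction_temp(t_amb_c, theta_ja, p_diss_w):
    """Tj = Tamb + theta_ja * Pdiss, all temperatures in degrees C."""
    return t_amb_c + theta_ja * p_diss_w

def arrhenius_scale(rate, ea_ev, tj_from_c, tj_to_c):
    """Rescale a failure rate quoted at tj_from_c to the rate at tj_to_c."""
    t_from = tj_from_c + 273.15
    t_to = tj_to_c + 273.15
    return rate * math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_from - 1.0 / t_to))

# Common base of Table 10: Tamb = 45 degC, theta_ja = 50 degC/W, Pdiss = 0.5 W
tj_base = junction_temp(45.0, 50.0, 0.5)

# Motorola row of Table 9: 0.30 f/10^6 hrs at Tj = 105 degC with Ea = 1.0 eV
motorola_adjusted = arrhenius_scale(0.30, 1.0, 105.0, tj_base)
print(f"Tj = {tj_base:.0f} degC, adjusted rate = {motorola_adjusted:.3f} f/10^6 hrs")
```

Under these assumptions the result agrees with the Tj = 70°C and 0.013 f/10^6 hrs entries listed in Table 10.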
TABLE 11 A comparison of the failure rates for a 8080 Microprocessor

Source                        Failure rate f/10^6 hrs.   Ea eV        Tj °C
Intel reference(32)           0.12                       0.50         89 typ.?
MIL 217D reference(37)        4.2                        0.55         106
MIL 217C reference(36)        510                        0.70         106
CNET reference(38)            3.8                        0.3 & 1.0    106
NCSR reference(34)            10.8                       0.55         145
RRE                           36.2                       -

Failure rates calculated for a hermetic package containing 1100 gates in a
"ground fixed" environment at Tamb. = 55°C
θja specified by reliability data source
Pdiss = worst case
Quality level = C1 (approximately computer grade)
TABLE 12 A comparison of 8080 Microprocessor adjusted failure rates

Source                        Failure rate f/10^6 hrs.
Intel reference(32)           0.01
MIL 217D reference(37)        23.3
MIL 217C reference(36)        833
CNET reference(38)            2.0 **
NCSR reference(34)            13.9

Failure rates converted to a common base of :
Tamb. = 45°C
θja = 50°C/W
Pdiss = 0.78W (typical)
Tj = 84°C
Ea = 1.0eV
Quality level = C1
** Since CNET data uses both 0.3 and 1.0eV activation energies, it is not
possible to convert to a single activation energy of 1.0eV, however the
junction temperature is adjusted from 106°C to 84°C.
TABLE 13 A comparison of the failure rates of a 2716 EPROM

Source                        Failure rate f/10^6 hrs.   Ea eV
Intel reference(29)           0.45 @ 60% C.L.            0.30
MIL 217D reference(37)        3.5                        0.55
MIL 217C reference(36)        34.5                       0.70
CNET reference(38)            0.64                       0.3 & 1.0
NCSR reference(34)            1.09                       0.55

Failure rates calculated with :
Tamb. = 55°C
θja = 25°C/W
Pdiss = 0.525W
Tj = 68°C
Quality level = C1
"ground fixed" environment
hermetic encapsulation
TABLE 14 A comparison of the failure rates of a 1k Bipolar ROM

Source                        Failure rate f/10^6 hrs.   Ea eV
Intel reference(30)           0.5 @ 90% C.L.             0.40
MIL 217D reference(37)        0.85                       0.45
MIL 217C reference(36)        4.90                       0.40
CNET reference(38)            0.31                       0.3 & 1.0
NCSR reference(34)            0.34                       0.40

Failure rates calculated with :
Tamb. = 85°C
θja = 30°C/W
Pdiss = 0.5W
Tj = 100°C
Quality level = C1
"ground fixed" environment
hermetic encapsulation
TABLE 15 A comparison of the failure rates of a 16k Dynamic RAM

Source                        Failure rate f/10^6 hrs.   Ea eV
Intel reference(28)           0.27 @ 60% C.L.            0.30
MIL 217D reference(37)        8.0                        0.55
MIL 217C reference(36)        261                        0.70
CNET reference(38)            0.84                       0.30 & 1.0
NCSR reference(34)            4.9                        0.55
Motorola reference(40)        0.083 @ 60% C.L.           1.0
Motorola reference(40)        1.5 @ 60% C.L.             0.3
  converted to Ea = 0.3eV

Failure rates calculated with :
Tamb. = 70°C
θja = 30°C/W
Pdiss = 0.4W
Tj = 82°C
"ground fixed" environment
hermetic encapsulation
Quality level = C1
TABLE 16 A comparison of the failure rates of a 1k Static RAM

Source                        Failure rate f/10^6 hrs.   Ea eV
Intel reference(31)           0.4 @ 90% C.L.             0.30
MIL 217D reference(37)        0.68                       0.55
MIL 217C reference(36)        13.2                       0.70
CNET reference(38)            0.22                       0.3 & 1.0
NCSR reference(34)            0.57                       0.55

Failure rates calculated with :
Tamb. = 55°C
θja = 30°C/W
Pdiss = 0.2W
Tj = 61°C
Quality level = C1
"ground fixed" environment
hermetic encapsulation
TABLE 17 The relation between cost, screening level and Quality factor

MIL 217 screen level   Screening method                    Quality factor   Typical relative cost
S                      MIL-M-38510 Class S                 0.5              8 - 20
S                      BS9400 Class A **                   0.5
B                      MIL-M-38510 Class B                 1.0
B                      BS9400 Class B                      1.0
B-1                    MIL-STD-883 method 5004 Class B     3.0              4 - 6
B-2                    Vendor equivalent of B-1            6.5
C                      MIL-M-38510 Class C                 8.0
C                      BS9400 Class C                      8.0
-                      BS9400 Class D                      10.0             2 - 4
C-1                    Vendor equivalent of C              13.0
C-1                    BS9400 Full Assessment level        13.0
D                      Commercial hermetic                 17.5             1.0
D-1                    Commercial plastic                  35.0             0.5

** Considerable differences exist in screening specification between
MIL-M-38510 Class S and BS9400 Class A.
BS9400 Quality factors are those suggested in reference(70)
TABLE 18 Recommended maximum junction temperatures for Semiconductors

Semiconductor device                             Maximum junction temperature
Transistors (Si) general purpose    Hermetic     100°C
                                    Plastic      80°C
Transistors (Si) power              Hermetic     110°C
                                    Plastic      90°C
Diodes                                           100°C
Linear I.C.s                        Hermetic     90°C
                                    Plastic      70°C
Digital I.C.s TTL                   Hermetic     100°C
                                    Plastic      70°C
Digital I.C.s CMOS                  Hermetic     90°C
                                    Plastic      70°C
TABLE 19 Signals carried by back-plane Bus
Bottom connector Top connector
Pin No. A C A C
1 0V 0V -12V -12V
2 D7(out) D7(in)
3 D6(out) D6(in) +5V 0V
4 D5(out) D5(in) +5V 0V
5 D4(out) D4(in) +5V 0V
6 D3(out) D3(in) +5V 0V
7 D2(out) D2(in) +28V
8 D1(out) D1(in) +28V
9 D0(out) D0(in)
10 SOD RESET
11 CLK OUT CLK OUT^IO
12 ALE SID
13 IO/M
14 WR
15 RD
16 A15 INTR
17 A14 RST 5.5
18 A13 RST 6.5
19 A12 RST 7.5
20 A11 TRAP
21 A10 ROM SELECT PORT 7(WR) PORT 7(RD)
22 A9 PORT 6(WR) PORT 6(RD)
23 A8 PORT 5(WR) PORT 5(RD)
24 A7 PORT 4(WR) PORT 4(RD)
25 A6 PORT 3(WR) PORT 3(RD)
26 A5 PORT 2(WR) PORT 2(RD)
27 A4 PORT 1(WR) PORT 1(RD)
28 A3 PORT 0(WR) PORT 0(RD)
29 A2
30 A1 SERIAL IN
31 A0 SERIAL OUT
32 +5V +5V +12V +12V
TABLE 20 Correction bits for SEC/DED Hamming code

C3 C2 C1 C0   Bit in error
0  0  0  0    No error
0  0  0  1    C0
0  0  1  0    C1
0  0  1  1    DBE
0  1  0  0    C2
0  1  0  1    DBE
0  1  1  0    DBE
0  1  1  1    D0
1  0  0  0    C3
1  0  0  1    DBE
1  0  1  0    DBE
1  0  1  1    D1
1  1  0  0    DBE
1  1  0  1    D2
1  1  1  0    D3
1  1  1  1    DBE

Horizontal parity =
ODD for single bit error (SBE)
EVEN for double bit error (DBE)
TABLE 21 Error position as indicated by error flags

S2 S1 S0 E    Error position
0  0  0  0    No error
0  0  0  1    RAM 0
0  0  1  1    RAM 1
0  1  0  1    RAM 2
0  1  1  1    RAM 3
1  0  0  1    RAM 4
1  0  1  1    RAM 5
1  1  0  1    RAM 6
1  1  1  1    RAM 7
0  0  1  0    Double bit error
APPENDIX 1 Circuit diagram of power supply
The circuit diagram of the controller power supply is given in figure (A1)
below.
[Circuit diagram; legible labels: 240V A.C. input; outputs +5V @ 5A, +28V @ 0.5A, +12V @ 1A, -12V @ 1A; regulators 78H05 and 7824]
FIGURE A1
APPENDIX 2 Circuit board layout
The layout of the three governor controller circuit boards is given in
the following figures A2-A4.
[Board layout, TOP VIEW; components U1 to U32 and R = 10k resistor network]
FIGURE A2 Microprocessor board layout
[Board layout, TOP VIEW; components U33 to U74 and switches, with switch labels Protect and ROM Enable]
FIGURE A3 Memory board layout
[Board layout, TOP VIEW; components U75 to U85, RTC write protect switch, S3, and terminals for +5V motor, 0V motor, motor phases, +28V, solenoid 1 and solenoid 2]
FIGURE A4 Input/output board layout
APPENDIX 3.1
ERROR FLAGS LOW PASS FILTER CALCULATION
Although the three processors are fed by a common clock, they will not 
run in exact synchronism because of different internal delays. 
Consideration of the 8085 timing given in referenceOS) shows that a 
maximum difference of 140ns is possible between processors on the address 
bus. Any difference between the three channels will cause transient dips 
in the error flags which must be smoothed out by the low pass filters. The 
50ns rise time of the error flags must be added to the 140ns giving a 190ns
maximum dip in the error flags. The voter flags must not be allowed to dip
below the logic '1' level of 2.0V for a 250ns interruption which includes a
safety margin added to the 190ns.
Hence the time constant of the LPF is calculated by :
$$V = V_0 \exp(-t/CR) \quad \text{where } V_0 = 5\,\text{V},\; V = 2\,\text{V}$$
$$2 = 5 \exp(-250\,\text{ns}/CR) \;\Rightarrow\; CR \approx 270\,\text{ns}$$
Using R = 1k ±5% this gives C = 285pF for Rmin. = 0.95k. Since the capacitor
tolerance is ±10%, a 330pF capacitor is used.
The low pass filter therefore consists of a 1k series resistor between the
FPLA output and the filtered error flag, with a 330pF capacitor from the
filtered error flag to ground.
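The component choice above can be checked numerically. The sketch below is illustrative only: the function name is hypothetical, and treating the worst case as "R at -5% with a 330pF capacitor at -10%" is an assumption about how the tolerances were combined.

```python
import math

def lpf_time_constant(v0, v_min, t_dip):
    """Return CR such that v0 * exp(-t_dip / CR) equals v_min."""
    return t_dip / math.log(v0 / v_min)

# Flag starts at 5 V and must stay above 2 V for a 250 ns interruption.
cr_required = lpf_time_constant(5.0, 2.0, 250e-9)

# Worst case for the chosen parts: 1k at -5% and 330 pF at -10%.
r_min = 1e3 * 0.95
c_min = 330e-12 * 0.90
cr_worst_case = r_min * c_min

print(f"required CR   = {cr_required * 1e9:.0f} ns")
print(f"worst-case CR = {cr_worst_case * 1e9:.0f} ns")
```

The exact requirement is slightly above the rounded 270ns used in the text, and the worst-case product of the chosen parts still exceeds it.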
APPENDIX 3.2 FPLA Programming Table
The following table will implement a 5 channel TMR voter including error
flags when programmed into the FPLA.
[Program table form for the 82S100/82S101 bipolar field programmable logic array (16X48X8), product terms 0 to 47; the programmed entries are illegible. Legend from the form:
INPUT VARIABLE: H = Im, L = complement of Im, - (dash) = don't care. Enter (-) for unused inputs of used P-terms.
OUTPUT FUNCTION: A = product term present in Fp, . (period) = product term not present in Fp. Entries are independent of output polarity. Enter (A) for unused outputs of used P-terms.
OUTPUT ACTIVE LEVEL: H = active high, L = active low. Polarity is programmed once only. Enter (H) for all unused outputs.
Input and output fields of unused P-terms can be left blank.]
APPENDIX 4
CALCULATION OF DELAY BEFORE LATCHING ERROR FLAGS
Component tolerances in the error flag low pass filters will cause
their outputs to change state at different times, therefore the error flags
should not be latched until they have all had time to settle to their final
value. The outputs of the filters will decay exponentially and,
considering the extremes of component tolerances, will have time constants
between the limits :
tmin. = 0.95k x 297pF = 282ns
tmax. = 1.05k x 363pF = 381ns
The graph below shows the exponential decay of the outputs plotted for
the two extremes of time constant.
The TTL logic thresholds of '1' = > 2.0V and '0' = < 0.8V leave an
indeterminate range of 0.8 - 2.0V. Taking the worst case, the voting error
is detected at time t1 when the output, having a time constant of 282ns,
drops below 2V. However the other error flags must be allowed time to
settle to the 0.8V level at a rate governed by the 381ns time constant.
This level is reached at time t2. The difference t2 - t1 is the maximum
possible delay between detecting an error and allowing the error flags to
settle. With the outputs decaying from 5V, the ratios are 5/2 = 2.5 and
5/0.8 = 6.25 :
$$t_1 = 282 \ln 2.5 = 258\,\text{ns}$$
$$t_2 = 381 \ln 6.25 = 698\,\text{ns}$$
$$\text{Hence : } t_2 - t_1 = 440\,\text{ns}$$
Hence the monostable U18b is designed to give a delay of 440ns between
detecting a voting error and latching the error flags.
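The delay calculation can be reproduced in a few lines. This is a sketch with illustrative variable names; the 5V starting level is taken from the low pass filter calculation of Appendix 3.1.

```python
import math

V_START = 5.0      # filtered flag level before the dip, volts
V_LOGIC_1 = 2.0    # TTL '1' threshold, volts
V_LOGIC_0 = 0.8    # TTL '0' threshold, volts

# Extremes of the filter time constant: 1k +/-5% and 330 pF +/-10%.
tau_min = 0.95e3 * 297e-12
tau_max = 1.05e3 * 363e-12

# Earliest detection: the fastest filter crosses the '1' threshold.
t1 = tau_min * math.log(V_START / V_LOGIC_1)
# Latest settling: the slowest filter reaches the '0' threshold.
t2 = tau_max * math.log(V_START / V_LOGIC_0)

latch_delay = t2 - t1
print(f"t1 = {t1 * 1e9:.0f} ns, t2 = {t2 * 1e9:.0f} ns, delay = {latch_delay * 1e9:.0f} ns")
```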
APPENDIX 5 Encoding ROM data 
0000 00 17 2B 3C 4D 5A 66 71 8E 99 A5 B2 C3 D4 E8 FF 
0010 00 17 2B 3C 4D 5A 66 71 8E 99 A5 B2 C3 D4 E8 FF 
The above table is programmed into a 32 x 8 PROM and is used to encode
data into the SEC/DED Hamming code. Four-bit wide data is encoded as
four data bits and four check bits.
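The table contents can be regenerated from Table 20. The sketch below assumes, from inspection of the listed values, that each byte holds the data bits D3..D0 in the upper nibble and the check bits C3..C0 in the lower nibble, and that each data bit contributes the check-bit pattern that Table 20 gives as its syndrome. The names are illustrative.

```python
# Check-bit pattern (C3 C2 C1 C0) contributed by each data bit, taken from the
# Table 20 syndromes: D0 -> 0111, D1 -> 1011, D2 -> 1101, D3 -> 1110.
CHECK_PATTERN = (0x7, 0xB, 0xD, 0xE)

def encode_nibble(data):
    """Return the 8-bit code word: data nibble high, check nibble low."""
    check = 0
    for bit, pattern in enumerate(CHECK_PATTERN):
        if data & (1 << bit):
            check ^= pattern   # check bits are the XOR of the selected patterns
    return (data << 4) | check

# One 16-entry row of the encoding PROM (the listing repeats it at 0010).
ENCODE_ROM = [encode_nibble(d) for d in range(16)]
print(" ".join(f"{byte:02X}" for byte in ENCODE_ROM))
```

Every valid code word differs from every other in at least four bit positions, which is what allows single bit errors to be corrected and double bit errors to be detected.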
APPENDIX 6 Decoding ROM data
0000 00 01 03 02 05 02 02 19 07 02 02 2B 02 4D 8F 02
0010 09 12 12 15 12 13 11 10 12 9F 5D 12 3B 12 12 17
0020 0B 22 22 27 22 AF 6D 22 22 23 21 20 39 22 22 25
0030 32 7D BF 32 37 32 32 1B 35 32 32 29 30 31 33 32
0040 0D 42 42 CF 42 47 6B 42 42 45 59 42 41 40 42 43
0050 52 7B 57 52 DF 52 52 1D 53 52 50 51 52 49 55 52
0060 62 79 65 62 63 62 60 61 EF 62 62 2D 62 4B 67 62
0070 71 70 72 73 72 75 69 72 72 77 5B 72 3D 72 72 FF
0080 0F 82 82 CD 82 AB 87 82 82 99 85 82 83 82 80 81
0090 92 97 BB 92 DD 92 92 1F 91 90 92 93 92 95 89 92
00A0 A2 A5 B9 A2 A1 A0 A2 A3 ED A2 A2 2F A2 A7 8B A2
00B0 B3 B2 B0 B1 B2 A9 B5 B2 B2 9B B7 B2 3F B2 B2 FD
00C0 C2 C3 C1 C0 D9 C2 C2 C5 EB C2 C2 C7 C2 4F 8D C2
00D0 D5 D2 D2 C9 D0 D1 D3 D2 D2 9D 5F D2 D7 D2 D2 FB
00E0 E7 E2 E2 CB E2 AD 6F E2 E0 E1 E3 E2 E5 E2 E2 F9
00F0 F2 7F BD F2 DB F2 F2 F7 E9 F2 F2 F5 F2 F3 F1 F0
The above table is programmed into a 256 x 8 PROM and is used to decode
data from the SEC/DED Hamming code. Eight-bit wide data, consisting of
four data bits and four check bits, is decoded as four data bits and
four error flags. Single bit errors in the data are corrected and the
error flags signal the occurrence and position of single and double bit
errors.
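The decoding table follows from Tables 20 and 21. The sketch below is an inference from the listed bytes rather than the author's generator: it assumes the PROM address is (data nibble, check nibble), the output is (corrected data nibble, flags S2 S1 S0 E), and RAM 0-3 hold C0-C3 while RAM 4-7 hold D0-D3. All names are illustrative.

```python
# Syndrome patterns from Table 20; flag nibbles (S2 S1 S0 E) from Table 21.
CHECK_PATTERN = (0x7, 0xB, 0xD, 0xE)                    # syndromes of D0..D3
CHECK_BIT_FLAGS = {0x1: 0x1, 0x2: 0x3, 0x4: 0x5, 0x8: 0x7}      # C0..C3 -> RAM 0..3
DATA_BIT_FLAGS = {0x7: (0, 0x9), 0xB: (1, 0xB), 0xD: (2, 0xD), 0xE: (3, 0xF)}
DOUBLE_BIT_ERROR = 0x2                                  # S = 001, E = 0

def check_bits(data):
    """Check nibble expected for a 4-bit data value."""
    check = 0
    for bit, pattern in enumerate(CHECK_PATTERN):
        if data & (1 << bit):
            check ^= pattern
    return check

def decode_byte(word):
    """Return (corrected data << 4) | error flags for one stored byte."""
    data, check = word >> 4, word & 0xF
    syndrome = check_bits(data) ^ check
    if syndrome == 0:
        return data << 4                                # no error
    if syndrome in CHECK_BIT_FLAGS:
        return (data << 4) | CHECK_BIT_FLAGS[syndrome]  # check bit wrong, data intact
    if syndrome in DATA_BIT_FLAGS:
        bit, flags = DATA_BIT_FLAGS[syndrome]
        return ((data ^ (1 << bit)) << 4) | flags       # flip the faulty data bit
    return (data << 4) | DOUBLE_BIT_ERROR               # even-weight syndrome: DBE

DECODE_ROM = [decode_byte(word) for word in range(256)]
```

For example, address 07 holds data 0 with check bits 0111; the syndrome 0111 points at D0, so the entry is 19: corrected data 1 with flags 1001 (RAM 4, error).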
APPENDIX 7 Operation of the Real Time Clock (Extract from the Radiospares data sheet)
The RS 58174 is a metal gate CMOS circuit that functions as a real time
clock and calendar in bus-orientated microprocessor systems. An interrupt
timer is included, which can be programmed to have one of three times. The
time base is obtained from a RS 32.768 kHz crystal with time keeping down to
2.2V for low power standby operation from batteries.
Application
When the system is powered up it is necessary to enter the correct data into
the device registers and start the clock running. The seconds, minutes,
hours, days and months counters are all parallel loaded with data from the
4 bit data bus when correctly addressed, CS is low and a write data strobe
pulse is given. Data to be entered is set up on the 4 bit data bus; the
address from Table 1 for the required register is set on the 4 bit address
bus and a write data strobe pulse is sent. Chip select must be low during
write and read operations. All information is entered in the same way with
the relevant address. To start the clock running at the required instant a
logic 1 is written to DB0 at address 14, likewise writing a 0 stops the
clock. Data can be read from a register by using the required address as in
Table 1 and applying a read strobe pulse. The data becomes available on the
4 bit data bus: chip select must be low for this read operation.
The internal counters are arranged as bytes of four bits each. When a byte
is addressed it will appear on the data I/O bus enabling independent access.
For bytes which do not contain 4 bits (e.g. week days use only 3 bits) the
unused bits are not recognised during a write operation and tied to Vss
during a read operation. The addressable reset latch holds tenths, units and
tens of seconds in a reset condition. If a register is updated during a read
operation the I/O data is prevented from updating and a subsequent read will
return the illegal b.c.d. code 1111. This allows detection that the previous
data had changed and is now incorrect. The interrupt timer can be programmed
for 0.5, 5 or 60 second intervals and may be coded for single or repeated
operation. The open drain interrupt output is pulled to Vss when the timer
times out and reading the interrupt register provides the status and
internal selected information.
Standby mode
This is automatically selected when the supply voltage falls to the standby
level (2.2V minimum) with no read or write strobes.
Test mode
This mode is used in production testing of the circuit. For normal operation
the circuit must be in non-test mode and set as part of the system
initialisation. Non-test mode is set by writing a logic 0 to DB3 at AD0.
Years Status Register
This is a 4 bit shift register which is shifted each year on the 31st
December. The status register must be set in accordance with the table below
and no readout capability is available.

                DB3 DB2 DB1 DB0
Leap year       1   0   0   0
Leap year + 1   0   1   0   0
Leap year + 2   0   0   1   0
Leap year + 3   0   0   0   1

TABLE 1
Selected counter            Address bits        Mode
                            AD3 AD2 AD1 AD0
0  Test only                0   0   0   0       Write only
1  Tenths of sec.           0   0   0   1       Read only
2  Units of secs.           0   0   1   0       Read only
3  Tens of secs.            0   0   1   1       Read only
4  Units of mins.           0   1   0   0       Read or Write
5  Tens of mins.            0   1   0   1       Read or Write
6  Units of hours           0   1   1   0       Read or Write
7  Tens of hours            0   1   1   1       Read or Write
8  Units of days            1   0   0   0       Read or Write
9  Tens of days             1   0   0   1       Read or Write
10 Day of week              1   0   1   0       Read or Write
11 Units of months          1   0   1   1       Read or Write
12 Tens of months           1   1   0   0       Read or Write
13 Years                    1   1   0   1       Write only
14 Stop/Start               1   1   1   0       Write only
15 Interrupt and status     1   1   1   1       Read or Write
A p p e n d i x 8 
D E S I G N O F R E D U N D A N T SOLEISIOID D R I V E R C I R C U I T R Y 
T h e s o l e n o i d is ra ted a s fo l lows : 
V n o r m a l = 24V 
Vhold = 2.6V 
V p u l l - i n = 8.0V 
R e s i s t a n c e = 2 8 7 ^ 
A n oc ta l dr iver is u s e d to cont ro l two s o l e n o i d s a n d the c i rcu i t d i a g r a m 
for o n e s o l e n o i d u s i n g half of the dr iver is s h o w n below : 
T h e a i m of the d e s i g n is that a n y of the c o m p o n e n t s exc lud ing the s o l e n o i d 
c a n fa i l , whi ls t still b e i n g a b l e to s w i t c h the s o l e n o i d ON a n d O F F . T h e 
r e s i s t o r v a l u e s R 2 - R 5 m u s t be c h o s e n to give the m a x i m u m p o s s i b l e vol tage 
s w i n g at the i r Junct ion with R l . 
T h e potent ia l d iv ider is s impl i f ied to : 
V c c 
In the expressions below x denotes the resistor ratio R2/R1, consistent with
the relation R2 = 1.73 x R1 used later in this appendix.
With three transistors off and one failed short circuit, the voltage V1 is
given by:
\[ V_1 = \frac{x}{1 + x} V_{cc} \]
Conversely, with three transistors ON and one failed open circuit, the
voltage V2 is given by:
\[ V_2 = \frac{x/3}{1 + x/3} V_{cc} = \frac{x}{3 + x} V_{cc} \]
Hence:
\[ \frac{\Delta V}{V_{cc}} = \frac{V_1 - V_2}{V_{cc}} = \frac{x}{1 + x} - \frac{x}{3 + x} = V' \]
For maximum ΔV:
\[ \frac{dV'}{dx} = 0 \]
Hence: x = 1.73
ΔVmax = 0.268 Vcc
Vcc was chosen to be 28V, which gives ΔVmax = 7.5V.
A sufficient swing to turn the solenoids ON and OFF is thus ensured by
choosing Vcc = 28V, but this voltage is sufficiently low so as not to damage
the solenoid due to overvoltage, should R1 fail.
The minimum acceptable ΔV is given by:
ΔVmin = Vpull-in - Vhold
      = 8.0 - 2.6 = 5.4V
Therefore the voltage swing of 7.5V is 2V greater than necessary and the
excess can be used to provide a safety margin. The safety margin will be
reduced by the loading effect of the solenoid on the potential divider.
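The optimum ratio and the resulting swing can be checked numerically. The Python sketch below is an illustration only and is not part of the original design work. It assumes that x is the ratio R2/R1, and the names `swing_fraction` and `best_ratio` are hypothetical. It ignores the loading effect of the solenoid, as the derivation above does.

```python
import math


def swing_fraction(x: float) -> float:
    """Return (V1 - V2) / Vcc for a resistor ratio x = R2/R1."""
    v1 = x / (1.0 + x)   # three transistors OFF, one failed short circuit
    v2 = x / (3.0 + x)   # three transistors ON, one failed open circuit
    return v1 - v2


def best_ratio(lo: float = 0.1, hi: float = 10.0, steps: int = 100_000) -> float:
    """Coarse grid search for the ratio that maximises the voltage swing."""
    candidates = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(candidates, key=swing_fraction)


x_opt = best_ratio()
vcc = 28.0
swing = swing_fraction(x_opt) * vcc   # available swing in volts
required = 8.0 - 2.6                  # Vpull-in - Vhold

if __name__ == "__main__":
    print(f"optimum x     = {x_opt:.3f} (sqrt(3) = {math.sqrt(3):.3f})")
    print(f"swing at 28V  = {swing:.2f} V")
    print(f"required      = {required:.1f} V")
    print(f"excess        = {swing - required:.2f} V")
```

The search agrees with the analytical result because setting the derivative to zero gives (3 + x)^2 = 3(1 + x)^2, whose positive root is x = √3.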
The zener diode is used to subtract a bias from the voltage swing in
order to satisfy the solenoid ON and OFF limits. Considering the diagram
below for three transistors OFF and one failed short circuit:
[Figure: +28V supply, divider resistor 1.73R, solenoid voltage Vs and zener voltage Vz]
The voltage at the potential divider junction is given by
\[ V = \frac{1.73R}{R + 1.73R} \times 28 = 17.7\,\mathrm{V} \]
To ensure that the solenoid is OFF, and including a safety margin of 0.5V,
it is required that Vs = 2.1V.
Hence: Vz = (28 - 17.7) - 2.1 = 8.2V
So an 8.2V zener diode is used.
The equivalent series resistance of the solenoid and zener diode is
approximately 500Ω, so R1 was chosen to be about a tenth of this value so
that the potential divider is not shunted too heavily by the load it is
driving. The value chosen was R1 = 68Ω and R2-R5 are given by the
relation:
R2 = 1.73 x R1 = 1.73 x 68 ≈ 120Ω
So R2-R5 were chosen to be 120Ω.
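The arithmetic for the zener voltage and the divider resistors can be reproduced as follows. This Python sketch is illustrative only. It assumes that the 120Ω value was obtained by rounding to the nearest preferred (E12) resistor value, which the text does not state, and the variable names are hypothetical.

```python
VCC = 28.0      # supply voltage in volts
X = 1.73        # chosen resistor ratio R2/R1
V_HOLD = 2.6    # solenoid hold voltage in volts
MARGIN = 0.5    # safety margin below the hold voltage in volts
R1 = 68         # ohms

# Junction voltage with three transistors OFF and one failed short circuit.
v_junction = X / (1.0 + X) * VCC

# The solenoid must see no more than Vhold minus the margin when OFF.
vs_off = V_HOLD - MARGIN

# The zener drops whatever remains between the supply and the junction.
vz = (VCC - v_junction) - vs_off

# Ideal R2, then the nearest preferred value (assumption: E12 series).
r2_ideal = X * R1
E12_HUNDREDS = [100, 120, 150, 180, 220, 270, 330, 390, 470, 560, 680, 820]
r2_chosen = min(E12_HUNDREDS, key=lambda r: abs(r - r2_ideal))

if __name__ == "__main__":
    print(f"junction voltage = {v_junction:.1f} V")
    print(f"zener voltage    = {vz:.1f} V")
    print(f"ideal R2         = {r2_ideal:.1f} ohm, chosen {r2_chosen} ohm")
```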
The complete circuit design is now given by:
[Circuit diagram: +28V supply, 68Ω resistor, four 120Ω resistors, solenoid R = 287Ω, 8.2V zener diode]
Using the above circuit, the voltages across the solenoid are checked.
With three transistors ON the solenoid voltage = 8.7V and with three
transistors OFF the solenoid voltage = 1.9V. This gives a 0.7V safety
margin on both solenoid limits. The circuit should be set up as follows
because of tolerance variations in the resistors and zener diode. The
power supply is reduced from 28V until the solenoid ceases to switch ON and
OFF using three of the transistors. The voltage is recorded and is equal
to Vmin. The power supply is then increased above 28V until the solenoid
again ceases to function correctly. This voltage is equal to Vmax.
Finally the power supply is set mid-way between Vmin and Vmax.
To protect against failure open circuit of the zener diode, two diodes
are used in parallel. Failure short circuit of the zener diode is not
fatal as long as no other components fail.
Consideration of the above circuitry shows that any component
excluding the solenoid can fail, whilst still being able to switch the
solenoid ON and OFF.
APPENDIX 9
List of Integrated Circuits
U1  - 82S101        U54 - 74LS138
U2  - 82S101        U55 - 2141
U3  - 82S101        U56 - 2141
U4  - 82S101        U57 - 2141
U5  - 82S101        U58 - 2141
U6  - 82S101        U59 - 2141
U7  - 1489          U60 - 2141
U8  - 8212 (1)      U61 - 2141
U9  - 8085 (1)      U62 - 2141
U10 - 8085 (2)      U63 - 2141
U11 - 8085 (3)      U64 - 2141
U12 - 8216 (1)      U65 - 2141
U13 - 8216 (1)      U66 - 2141
U14 - 8216 (2)      U67 - 2141
U15 - 8216 (2)      U68 - 2141
U16 - 8216 (3)      U69 - 2141
U17 - 8216 (3)      U70 - 2141
U18 - 74LS123       U71 - 74LS32
U19 - 74LS123       U72 - 74LS32
U20 - 74LS04        U73 - 74LS32
U21 - 74LS10        U74 - 74LS32
U22 - 74LS138       U75 - 74LS09
U23 - 74LS08        U76 - 74LS00
U24 - 8212 (2)      U77 - 58174
U25 - 8212 (3)      U78 - 8216
U26 - 1488          U79 - DG508
U27 - 74LS390       U80 - ZN427
U28 - 74LS00        U81 - 74LS74
U29 - CA3046        U82 - 74LS244
U30 - 74LS367       U83 - 74LS138
U31 - 74LS374       U84 - 74LS273
U32 - 74LS74        U85 - 8232
U33 - 74S188        U86 - LM324
U34 - 74S188        U87 - LM324
U35 - 74S371        U88 - LM324
U36 - 74S371        U89 - LM324
U37 - 74LS374       U90 - LM324
U38 - 81LS97        U91 - LM324
U39 - 74LS32
U40 - 74LS74
U41 - 74LS138
U42 - 74LS139
U43 - 74LS86
U44 - 74LS08
U45 - Instant ROM
U46 - Instant ROM
U47 - 2516
U48 - 2516
U49 - 2516
U50 - 2516
U51 - 8216
U52 - 8216
U53 - 81LS97
APPENDIX 10
Governor Controller Software Listings
The controller software is written as nineteen modules which are linked
together to form the complete software package. The module listings are
as follows:
;MAIN CALLING PROGRAM
        NAME    ('MAIN')
        CSEG
        EXT     WTRAP,DFAULT,MERROR,RESYNC,SNAKE,INITL,MSGE
        EXT     TIMLOG,RBKGEN,CNTRL,PRBUFF,SLFTST,NMOUT
        PUBLIC  MAIN,STACK
        ORG     0000H
MAIN:   DI                      ;DISABLE INTERRUPTS
        OUT     00H             ;DISABLE RST/HOLD
        JMP     MAIN1
        ORG     0024H
        JMP     WTRAP           ;TRAP WATCHDOG VECTOR
        ORG     0028H
        JMP     DFAULT          ;RST5 SOFTWARE ERROR VECTOR
        ORG     002CH
        JMP     MERROR          ;RST5.5 MEMORY ERROR VECTOR
        ORG     0034H
        JMP     RESYNC          ;RST6.5 VOTING ERROR VECTOR
        ORG     0038H
        JMP     SNAKE           ;RST7 SNAKE VECTOR
MAIN1:  MVI     C,0FFH          ;INITIALISE ALL RAM TO FF FOR SNAKE
        LXI     H,3000H         ;START OF RAM
FILL:   MOV     M,C
        INX     H
        MOV     A,H
        CPI     40H             ;END OF RAM?
        JNZ     FILL            ;LOOP UNTIL ALL RAM FILLED
        LXI     SP,STACK        ;INITIALISE STACK
        CALL    INITL           ;INITIALISE REGISTERS AND VARIABLES
        LXI     H,CLDMSG        ;LOG COLD RESET
        CALL    MSGE
        LXI     H,ROMMSG        ;LOG EPROM SET IN USE
        CALL    MSGE
        LDA     0007H           ;READ NUMBER FROM EPROM
        CALL    NMOUT           ;PRINT IT
        CALL    TIMLOG          ;AND TIME
        OUT     02H             ;ENABLE RST/HOLD
MAIN2:  CALL    RBKGEN          ;GENERATE RECOVERY BLOCK (MAIN2 LABEL POSITION INFERRED)
        EI                      ;ENABLE INTERRUPTS AFTER RBKGEN!
        OUT     06H             ;RESET WATCHDOGS
        CALL    CNTRL           ;CONTROL ALGORITHM
        CALL    PRBUFF          ;PRINT A CHARACTER FROM BUFFER
        CALL    SLFTST          ;SELF TEST - SOLENOIDS EVERY 10 MINS.
        JMP     MAIN2           ;REPEAT CALLING LOOP
CLDMSG: DB      0AH,'FULL SYSTEM RESET',00H
ROMMSG: DB      0AH,'EPROM SET ',00H
STACK   EQU     3F00H
        END
;RESYNCHRONISATION ROUTINE - RST6.5 INTERRUPT
;DESTROYS ALL REGS. AND EXECUTES RECOVERY BLOCK
        NAME    ('RESYNC')
        CSEG
        EXT     MSGE,NMOUT,BLOCK,STACK,COUTBF
        PUBLIC  RESYNC,SRETRY
RESYNC: OUT     00H             ;DISABLE RST/HOLD - QUICKER EXECUTION
        LXI     SP,STACK        ;RESET STACK POINTER
        PUSH    PSW             ;RESYNCHRONISATION ROUTINE
        PUSH    H
        LXI     H,0000H
        DAD     SP
        PUSH    H               ;PUSH STACK POINTER
        PUSH    B
        PUSH    D
        POP     D               ;REVERSE THE PROCESS
        POP     B
        POP     H
        SPHL                    ;RESTORE STACK POINTER
        POP     H
        POP     PSW
        MVI     A,40H           ;RESET SOD LINE
        SIM
        LXI     H,SRETRY
        DCR     M               ;DECREMENT RETRY COUNTER HELD IN RAM
        JZ      DISABL          ;DISABLE RST 6.5 IF RETRY EXHAUSTED
        EI                      ;ENABLE INTERRUPT FOR HARDWARE RETRY
        OUT     02H             ;ENABLE RST/HOLD
        IN      02H             ;GET CURRENT SYNDROME
        ANI     03H             ;MASK OFF ERROR FLAGS
        CPI     03H             ;TEST FOR BOTH FLAGS HIGH
        JZ      ERRLOG          ;NO ERROR SO LOG TRANSIENT
        OUT     07H             ;GENERATE HARDWARE RST6.5 INTERRUPT
        HLT                     ;WAIT FOR INTERRUPT
ERRLOG: LXI     H,EMSGE         ;LOG TRANSIENT ERROR
        JMP     NOCHAN
DISABL: RIM                     ;SWITCH OUT RST 6.5
        ANI     07H
        ORI     0AH
        SIM
        OUT     00H             ;DISABLE RST/HOLD
        LXI     H,DMSGE         ;LOG CPU FAILURE
NOCHAN: CALL    MSGE
        IN      01H             ;GET LATCHED ERROR SYNDROME
        ANI     03H             ;STRIP OFF STATUS BITS
        LXI     H,TABLE1        ;LOOK UP TABLE FOR CHANNEL IN ERROR
        ADD     L
        MOV     L,A
        MOV     A,M
        CALL    NMOUT           ;PRINT IT
        MVI     C,0AH
        CALL    COUTBF
        LXI     H,SYNMSG        ;LOG SYNDROME
        CALL    MSGE
        IN      01H             ;GET LATCHED SYNDROME
        CALL    NMOUT           ;PRINT IT
        LXI     H,RETMSG        ;LOG NUMBER OF RETRIES
        CALL    MSGE
        LXI     H,SRETRY
        MVI     A,0FFH
        SUB     M
        CALL    NMOUT
END:    MVI     A,0FFH          ;RESET RETRY COUNTER
        STA     SRETRY          ;STORE IN RAM
        OUT     01H             ;RESET SYNDROME LATCH FLIP/FLOP
        JMP     BLOCK           ;JUMP TO RECOVERY BLOCK
EMSGE:  DB      0AH,'SYNC ERR CHANNEL ',00H
DMSGE:  DB      0AH,'CPU FAIL CHANNEL ',00H
SYNMSG: DB      'SYND=',00H
RETMSG: DB      ' RETRIES=',00H
TABLE1: DB      02H,03H,01H,00H
        DSEG
SRETRY: DS      1
        END
;************************************************************
;MEMORY ERROR ROUTINE - RST 5.5 INTERRUPT
;SAVES ALL REGS. UNLESS DBE - RECOVERY BLOCK IF DBE
        NAME    ('MERROR')
        CSEG
        EXT     MSGE,NMOUT,BLOCK,COUTBF,SFAIL,TIMLOG
        PUBLIC  MERROR
MERROR: PUSH    PSW             ;SAVE REGISTERS
        PUSH    H
        PUSH    B
        PUSH    D
        IN      00H             ;SAVE LATCHED SYNDROME & RESET LATCH
        MOV     C,A             ;SAVE SYNDROME IN C REG.
        MVI     B,MRETRY        ;INITIALISE RETRY COUNTER
MREAD:  LXI     H,TSTLOC        ;IS WRITE/READ TO TESTLOC OK
RWLOOP: XRA     A               ;CLEAR A
        MOV     M,A             ;WRITE TO TESTLOC
        MOV     A,M             ;READ BACK
        ORA     A               ;SET FLAGS
        JNZ     AGAIN           ;TRY AGAIN IF READ ERROR
        RIM                     ;IS RST5.5 INTERRUPT PENDING
        ANI     10H
        JZ      RW01            ;JUMP IF NO RST5.5 WITH ZERO IN A REG
        IN      00H             ;OTHERWISE READ SYNDROME
RW01:   MOV     E,A             ;SAVE SYNDROME IN E REG.
        MVI     A,0FFH          ;READ/WRITE FF
        MOV     M,A             ;WRITE TO TESTLOC
        MOV     A,M             ;READ BACK
        INR     A               ;FF BECOMES 00
        JZ      UPDATE          ;O.K. SO CHECK SYNDROME
AGAIN:  DCR     B               ;DECREMENT RETRY COUNTER
        JNZ     RWLOOP          ;TRY AGAIN IF NOT EXHAUSTED
        LXI     H,SFMSG         ;LOG TOTAL RAM FAILURE
        JMP     SFAIL           ;PRINT MESSAGE AND SOFT FAILURE
UPDATE: CALL    SYNRD           ;READ CURRENT SYNDROME
        MOV     D,A             ;SAVE FF W/R SYNDROME
        ORA     E               ;OR IN 00 W/R SYNDROME
        JZ      ERRLOG          ;LOG TRANS ERROR IF BOTH SYNDROMES=00
        IN      00H             ;CLEAR SYNDROME LATCH
        DCR     B               ;DECREMENT RETRY COUNTER
        JNZ     MREAD           ;RETRY IF NOT EXHAUSTED
        MOV     C,E             ;00 SYNDROME = DBE ?
        CALL    DBE
        JZ      DBETST          ;R/W O.K. - EITHER HARDWARE OR DBE IN CHECK BITS
        MOV     C,D             ;FF SYNDROME = DBE ?
        CALL    DBE
        JZ      DBETST          ;DBE IN CHECK BITS OR HARDWARE
        MOV     A,E             ;NOT DBE SO ARE 00,FF SYNDROME EQUAL OR IS ONE
        ORA     A               ;/EQUAL TO ZERO
        JZ      RFAIL           ;JUMP FOR STUCK AT SINGLE BIT ERROR
        MOV     C,E             ;E SYNDROME NOT ZERO OR HARDWARE FAULT SO LOAD
        MOV     A,D             ;/C REG WITH SYNDROME TO BE LOGGED
        ORA     A
        JZ      RFAIL           ;JUMP FOR STUCK AT SBE
        CMP     E               ;TEST FOR SYNDROME EQUALITY
        JNZ     HWFLTY          ;MUST BE HARDWARE FAULT
RFAIL:  LXI     H,FMSGE         ;LOG STUCK AT SBE
        JMP     DISABL
DBETST: LXI     H,TSTDTA        ;TABLE OF TEST DATA TO READ/WRITE
NXTDTA: MOV     A,M             ;GET BYTE OF TEST DATA
        ORA     A               ;SET FLAGS
        JZ      HWFLTY          ;END OF TABLE - MUST BE HARDWARE FAULT
        STA     TSTLOC          ;WRITE TEST DATA TO TESTLOC
        IN      00H             ;CLEAR SYNDROME LATCH
        LDA     TSTLOC          ;READ BACK TEST DATA
        CALL    SYNRD           ;READ SYNDROME
        ORA     A               ;SET FLAGS
        INX     H               ;INCREMENT TEST DATA TABLE POINTER
        JNZ     NXTDTA          ;JUMP IF SYNDROME IN A REG NOT ZERO
        LXI     H,DBFMSG        ;OTHERWISE LOAD DBE MESSAGE
        JMP     DISABL          ;LOG MESSAGE AND DISABLE RST5.5
ERRLOG: RIM
        ANI     10H             ;RST 5.5 PENDING ?
        JZ      TRANS           ;INTERRUPT NOT STUCK SO MUST BE TRANSIENT
        DCR     B
        JNZ     MREAD           ;PERFORM RETRY
HWFLTY: LXI     H,HWMSGE        ;MEMORY HARDWARE FAILED
DISABL: RIM                     ;SWITCH OUT RST 5.5
        ANI     07H
        ORI     09H
        SIM
        JMP     SNDMSG          ;LOG FAILURE
TRANS:  LXI     H,EMSGE         ;LOG TRANSIENT ERROR
SNDMSG: CALL    MSGE
        MOV     A,C             ;GET SYNDROME
        CALL    NMOUT           ;PRINT SYNDROME
        CALL    TIMLOG          ;LOG TIME
        CALL    DBE             ;IF ERROR=DBE THEN RETURN VIA RECOVERY BLOCK
        JZ      HBLOCK
        POP     D               ;RESTORE REGS. AND RETURN
        POP     B
        POP     H
        POP     PSW
        EI
        RET
HBLOCK: EI
        JMP     BLOCK           ;JUMP TO RECOVERY BLOCK SINCE DBE
SYNRD:  RIM
        ANI     10H             ;TEST FOR PENDING RST5.5
        RZ                      ;OTHERWISE RETURN A=00
        IN      00H             ;READ SYNDROME LATCH
        RET
DBE:    MOV     A,C             ;ROUTINE RETURNS WITH ZERO FLAG SET IF DBE
        ANI     0FH
        CPI     02H
        RZ
        MOV     A,C
        ANI     0F0H
        CPI     20H
        RET
SFMSG:  DB      0AH,'TOTAL RAM FAILURE',0AH,00H
FMSGE:  DB      0AH,'RAM FAILURE S=',00H
EMSGE:  DB      0AH,'TRANS RAM ERR S=',00H
HWMSGE: DB      0AH,'RAM HWARE FAIL S=',00H
DBFMSG: DB      0AH,'CBIT STUCK DBE S=',00H
TSTDTA: DB      33H,55H,66H,99H,0AAH,0CCH,00H
MRETRY  EQU     64H
        DSEG
TSTLOC: DS      1               ;RESERVE 1 BYTE FOR TESTLOC
        END
;MAIN CONTROL ALGORITHM - DAG PRESSURE CONTROL
        NAME    ('CNTRL')
        CSEG
        EXT     PGOV,POUT,PDELTA,PRESSR
        PUBLIC  CNTRL
CNTRL:  CALL    PRESSR          ;READ PRESSURE TRANSDUCERS
        XRA     A               ;CLEAR CARRY
        LDA     PDELTA          ;GET PRESSURE DIFFERENTIAL ACROSS PLATE
        RAL                     ;MULTIPLY BY 2
        ADI     PSET            ;CALCULATE PSET+2PDELTA
        MOV     C,A             ;SAVE IN C REG.
        LDA     PGOV            ;GET GOVERNOR PRESSURE READING
        CMP     C               ;COMPARE WITH PSET+2PDELTA
        JZ      HOLD            ;HOLD PRESSURE IF EQUAL
        JC      UP              ;PGOV LOW SO INCREASE PRESSURE
DOWN:   SUB     C               ;DECREASE PRESSURE AND CALC. PRESSURE ERROR
        MVI     D,0FFH          ;SOLENOID DOWN CONTROL WORD
        JMP     BWIDTH          ;NO ACTION IF PRESSURE ERROR WITHIN LIMITS
UP:     MOV     B,A             ;INCREASE PRESSURE - REVERSE SUBTRACTION
        MOV     A,C
        SUB     B               ;CALCULATE PRESSURE ERROR
        MVI     D,00H           ;SOLENOID UP CONTROL WORD
BWIDTH: CPI     MAX             ;IS PRESSURE ERROR LESS THAN MAX.
        JC      HOLD            ;HOLD PRESSURE IF SO
        MOV     A,D             ;OTHERWISE ADJUST PRESSURE UP OR DOWN
        JMP     SOLOUT          ;OUTPUT TO SOLENOID DRIVERS
HOLD:   MVI     A,0FH           ;SOLENOID HOLD CONTROL WORD
SOLOUT: OUT     04H             ;OUTPUT TO SOLENOID PORT
        LXI     H,1000H         ;DELAY
DELAY:  DCR     L
        JNZ     DELAY
        DCR     H
        JNZ     DELAY
        RET
MAX     EQU     05H             ;0.5 INCHES W.G.
PSET    EQU     90              ;9 INCHES W.G.
        END
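The decision made by the CNTRL listing can be restated compactly in a high-level language. The Python sketch below is an illustration and is not part of the original software. It assumes the reading of the listing given above, namely a target of PSET + 2 x PDELTA held in an 8-bit register, a dead band of MAX counts, and the three solenoid control words 0FFH, 00H and 0FH. The function name `control_word` is hypothetical.

```python
PSET = 90       # set point in counts (9 inches W.G. at 10 counts per inch)
MAX_ERR = 5     # dead band in counts (0.5 inches W.G.)
DOWN, UP, HOLD = 0xFF, 0x00, 0x0F   # solenoid control words from the listing


def control_word(pgov: int, pdelta: int) -> int:
    """Return the solenoid control word for the given pressure readings."""
    # RAL followed by ADI PSET works in the 8-bit accumulator.
    target = (PSET + 2 * pdelta) & 0xFF
    error = abs(pgov - target)
    if error < MAX_ERR:
        return HOLD                 # within the dead band: no action
    return UP if pgov < target else DOWN


if __name__ == "__main__":
    for pgov in (90, 97, 100, 103, 105, 110):
        print(pgov, format(control_word(pgov, 5), "02X"))
```

The dead band prevents the solenoids from being pulsed for pressure errors smaller than the resolution that matters to the plant, at the cost of a small steady-state error.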
;TAKE TIME AVERAGE OF EACH PRESSURE TRANSDUCER, THEN PERFORM
;MAJORITY VOTE AND RETURN VALUES - PGOV, POUT, PDELTA
;DESTROYS ALL REGISTERS
        NAME    ('PRESSR')
        CSEG
        EXT     MSGE,NMOUT,TIMLOG
        PUBLIC  PRESSR,PRETRY,PGOV,POUT,PDELTA
PRESSR: LXI     H,PTABLE        ;INITIALISE PRESSURE READINGS TABLE PNTR.
        MVI     B,05H           ;CHANNEL COUNTER
SCAN:   CALL    PAVE            ;TIME AVERAGE FOR 8 READINGS OF CHANNEL
        MOV     M,A             ;SAVE IN TABLE
        INX     H               ;INCREMENT TABLE POINTER
        DCR     B               ;DECREMENT CHANNEL COUNT
        JP      SCAN            ;LOOP UNTIL FINISHED
        LXI     H,PTABLE+2      ;START AT END OF TABLE - WORK BACKWARDS
        LXI     D,PTABLE+6      ;STORE AVERAGES IN TABLE
        CALL    AVRGE2          ;CALCULATE 3 DIFFERENT AVES. - CHANNELS 3-5
        LXI     H,PTABLE+5      ;REPEAT FOR CHANNELS 0-2
        LXI     D,PTABLE+9
        CALL    AVRGE2
VOTE:   LXI     H,PTABLE+3      ;TABLE FOR CHANNELS 0-2
        LXI     D,PTABLE+9      ;CORRESPONDING AVERAGES
        MVI     C,00H           ;CHANNELS OFFSET
        CALL    CHECK           ;COMPARE CHANNELS AND AVERAGES AND LOG
        STA     PGOV            ;/ERRORS - STORE BEST PRESSURE IN PGOV
        LXI     H,PTABLE        ;REPEAT SAME FOR CHANNELS 3-5
        LXI     D,PTABLE+6
        MVI     C,03H
        CALL    CHECK
        STA     POUT            ;STORE BEST PRESSURE IN POUT
        MOV     C,A             ;CALCULATE PRESSURE DIFFERENTIAL ACROSS
        LDA     PGOV            ;/ORIFICE PLATE
        SUB     C
        STA     PDELTA
        RET
PAVE:   PUSH    H               ;SAVE HL
        LXI     H,0000H         ;ZERO HL
        LXI     D,0000H         ;ZERO DE
        MVI     C,08H           ;READ CHANNEL 8 TIMES
PAVE1:  CALL    ADREAD          ;READ CHANNEL
        MOV     E,A
        DAD     D               ;ADD VALUE TO PREVIOUS SUM
        DCR     C               ;DECREMENT LOOP COUNTER
        JNZ     PAVE1           ;LOOP UNTIL DONE 8 TIMES
        MVI     C,03H           ;COUNT 3 SHIFTS RIGHT HL - DIVIDE BY 8
PAVE2:  MOV     A,H
        RAR
        MOV     H,A
        MOV     A,L
        RAR
        MOV     L,A
        DCR     C
        JNZ     PAVE2           ;REPEAT UNTIL 3 SHIFT OPERATIONS
        POP     H               ;RESTORE HL
        RET
ADREAD: MOV     A,B             ;GET CHANNEL NO. 0-5
        OUT     05H             ;OUTPUT TO MUX
        OUT     03H             ;START CONVERSION A/D CONVERTOR
NEOC:   IN      04H             ;TEST EOC FLAG
        RLC
        JNC     NEOC            ;LOOP TILL CONVERSION COMPLETE
        IN      03H             ;READ A/D CONVERTOR
        SUI     OFFSET          ;SUBTRACT CALIBRATION OFFSET
        RET
AVRGE2: XRA     A               ;CLEAR CARRY
        MOV     A,M             ;GET CHANNEL C
        MOV     C,A             ;SAVE IN C REG.
        DCX     H               ;POINT TO CHANNEL B
        ADD     M               ;ADD CHANNELS B+C
        RAR                     ;DIVIDE BY 2
        STAX    D               ;STORE (B+C)/2
        XRA     A               ;CLEAR CARRY
        MOV     A,C             ;GET CHANNEL C
        DCX     H               ;POINT TO CHANNEL A
        INX     D               ;INCREMENT TABLE POINTER
        ADD     M               ;ADD CHANNELS A+C
        RAR                     ;DIVIDE BY 2
        STAX    D               ;STORE (A+C)/2
        XRA     A               ;CLEAR CARRY
        MOV     A,M             ;GET CHANNEL A
        INX     H               ;POINT TO CHANNEL B
        INX     D               ;INCREMENT TABLE POINTER
        ADD     M               ;ADD CHANNELS A+B
        RAR                     ;DIVIDE BY 2
        STAX    D               ;STORE (A+B)/2
        RET
AVRGE3: PUSH    B               ;SAVE REGISTERS
        PUSH    H
        PUSH    D
        LXI     D,0000H         ;INITIALISE SUM
        XRA     A               ;CLEAR CARRY
        MOV     A,M             ;GET 1ST VALUE
        INX     H
        ADD     M               ;ADD 2ND VALUE
        MOV     E,A             ;SAVE IN E REG.
        JNC     SKIP
        INR     D               ;INCREMENT D IF CARRY
SKIP:   XRA     A               ;CLEAR CARRY
        INX     H
        MOV     A,M
        ADD     E               ;ADD IN 3RD VALUE
        MOV     E,A             ;SAVE IN E REG.
        JNC     DIV3
        INR     D               ;INCREMENT D IF CARRY ELSE SKIP
DIV3:   LXI     H,0000H         ;INITIALISE RESULT
        MVI     C,04H           ;SHIFT AND ADD 4 TIMES EQUALS DIVIDE BY 3
DIV3A:  CALL    SHIFT           ;SHIFT DE RIGHT ONE PLACE
        CALL    SHIFT           ;AND AGAIN
        DAD     D               ;ADD TO PARTIAL RESULT
        DCR     C               ;DECREMENT SHIFT COUNTER
        JNZ     DIV3A
        MOV     A,L             ;PUT RESULT IN A REG.
        POP     D
        POP     H
        POP     B
        RET
SHIFT:  MOV     A,D             ;SHIFT D RIGHT THRO CARRY
        RAR
        MOV     D,A
        MOV     A,E             ;SHIFT E RIGHT LINKING DE SHIFT VIA CARRY
        RAR
        MOV     E,A
        RET
CHECK:  MVI     B,03H           ;CHANNEL COUNTER - 3 CHANNELS
        CALL    AVRGE3          ;CALCULATE AVERAGE OF 3 CHANNELS
        STA     PTEMP           ;SAVE RESULT IN CASE IT IS USED
CHECK3: CALL    MOD             ;IS MOD(A-(B+C)/2) ETC. GTR. THAN ALLOWED
        JC      CHECK1          ;JUMP IF NOT
        LDAX    D               ;IF SO OVERWRITE PRESSURE
        STA     PTEMP           ;/WITH AVERAGE OF OTHER 2
        PUSH    H               ;SAVE REGS
        PUSH    B
        LXI     H,PRETRY
        DCR     M               ;DECREMENT RETRY COUNTER
        JZ      CHECK2          ;DO NOT LOG ERROR IF RETRY EXHAUSTED
        LXI     H,CHMSG         ;LOG ERROR
        CALL    MSGE
        MOV     A,B             ;GET CHANNEL COUNT
        ADD     C               ;ADD OFFSET
        CALL    NMOUT           ;LOG CHANNEL IN ERROR
        CALL    TIMLOG          ;AND TIME
CHECK2: MVI     A,01H           ;SET RETRY COUNTER TO 1
        STA     PRETRY
        POP     B               ;RESTORE REGS.
        POP     H
CHECK1: INX     H               ;INCREMENT TABLE COUNTERS
        INX     D               ;/TO NEXT CHANNEL
        DCR     B               ;DECREMENT LOOP COUNTER
        JNZ     CHECK3          ;LOOP UNTIL ALL 3 CHANNELS DONE
        LDA     PTEMP           ;RETURN WITH RESULT IN A REG.
        RET
MOD:    PUSH    B               ;SAVE BC
        LDAX    D               ;(AVERAGE OF OTHER 2) MINUS (OTHER VALUE)
        SUB     M               ;E.G. (B+C)/2 - A
        JNC     MOD1            ;JUMP IF RESULT POSITIVE
        LDAX    D               ;OTHERWISE REVERSE SUBTRACTION ORDER
        MOV     C,A
        MOV     A,M
        SUB     C
MOD1:   CPI     DELTA           ;COMPARE WITH PERMISSIBLE DIFFERENCE
        POP     B
        RET
CHMSG:  DB      0AH,'PRESSR ERROR CH ',00H
OFFSET  EQU     10              ;1 INCH W.G.
DELTA   EQU     10              ;1 INCH W.G.
        DSEG
PTABLE: DS      12
PTEMP:  DS      1
PGOV:   DS      1
POUT:   DS      1
PDELTA: DS      1
PRETRY: DS      1
        END
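Two parts of the PRESSR listing are easier to follow in a high-level form: the shift-and-add division by three in AVRGE3 and the comparison of each channel with the mean of the other two in CHECK. The Python sketch below is an illustration under the reading of the listing given above and is not part of the original software. The function names `div3_shift_add` and `vote` are hypothetical, and error logging and the retry counter are omitted.

```python
DELTA = 10  # permissible difference in counts (1 inch W.G.)


def div3_shift_add(total: int) -> int:
    """Approximate total/3 as total/4 + total/16 + total/64 + total/256."""
    result = 0
    for _ in range(4):          # MVI C,04H: four passes
        total >>= 2             # two calls to SHIFT per pass
        result += total         # DAD D: add to the partial result
    return result & 0xFF        # only the low byte is returned in A


def vote(readings, delta=DELTA):
    """Return (best value, indices of channels that failed the check)."""
    a, b, c = readings
    best = div3_shift_add(a + b + c)
    # AVRGE2 stores, for each channel, the mean of the other two.
    others = [(b + c) >> 1, (a + c) >> 1, (a + b) >> 1]
    suspects = []
    for index, (value, reference) in enumerate(zip(readings, others)):
        if abs(reference - value) >= delta:
            best = reference    # overwrite with the mean of the other two
            suspects.append(index)
    return best, suspects


if __name__ == "__main__":
    print(div3_shift_add(300), 300 // 3)
    print(vote((100, 101, 99)))
    print(vote((100, 100, 112)))
```

The truncated series 1/4 + 1/16 + 1/64 + 1/256 sums to about 0.332, so the approximation slightly underestimates the true mean. In this model a channel that deviates by much more than twice DELTA also pulls the other two reference means outside the limit, in which case the last overwrite determines the returned value.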
;****************************************************************
;SAVES CURRENT PROGRAM STATUS OF ALL REGISTERS
        NAME    ('RBKGEN')
        CSEG
        PUBLIC  RBKGEN,RBLOCK
RBKGEN: DI                      ;MUST NOT INTERRUPT RECOVERY BLOCK
        OUT     00H             ;DISABLE RST/HOLD - QUICKER EXECUTION
        SHLD    RBLOCK+6        ;SAVE HL
        XTHL
        SHLD    RBLOCK+8        ;SAVE PC RETURN ADDRESS
        XTHL                    ;RESTORE STACK TOP
        LXI     H,0000H
        DAD     SP
        SHLD    RBLOCK+10       ;SAVE SP
        LXI     SP,RBLOCK+6     ;CHANGE SP TO RECOVERY BLOCK ADDRESS
        PUSH    D               ;SAVE DE
        PUSH    B               ;SAVE BC
        PUSH    PSW             ;SAVE PSW
        RIM                     ;IF RST6.5 DISABLED LEAVE RST/HOLD DISABLED
        ANI     02H
        JNZ     RETURN
        OUT     02H             ;ENABLE RST/HOLD
RETURN: POP     PSW             ;RESTORE PSW
        LHLD    RBLOCK+10       ;RESTORE SP
        SPHL
        LHLD    RBLOCK+6        ;RESTORE HL
        EI
        RET
        DSEG
RBLOCK: DS      12              ;RESERVE 12 BYTES
        END
        NAME    ('BLOCK')
        CSEG
        EXT     RBLOCK,TIMLOG,MSGE
        PUBLIC  BLOCK
BLOCK:  LXI     H,VMSG          ;LOG VECTORED RECOVERY
        CALL    MSGE
        CALL    TIMLOG
        LXI     SP,RBLOCK       ;LOAD SP WITH RECOVERY BLOCK START ADDRESS
        POP     PSW             ;RESTORE PSW
        POP     B               ;RESTORE BC
        POP     D               ;RESTORE DE
        LHLD    RBLOCK+10       ;RESTORE SP
        SPHL
        LHLD    RBLOCK+8        ;SAVE RETURN ADDRESS ON STACK TOP
        XTHL
        LHLD    RBLOCK+6        ;RESTORE HL
        RET                     ;JUMP TO RETURN ADDRESS
VMSG:   DB      0AH,'VECTORED RECOVERY',00H
        END
;SELF TEST OF SOLENOID DRIVERS - CAN BE EXPANDED TO TEST ENTIRE SYSTEM
        NAME    ('SLFTST')
        CSEG
        EXT     MSGE,TIMLOG
        PUBLIC  SLFTST,TFLAG
SLFTST: IN      14H             ;READ MINUTES FROM CLOCK
        ANI     0FH             ;SELECT LOWER BITS
        CPI     0FH             ;ILLEGAL READ?
        JZ      SLFTST          ;IF SO TRY AGAIN
        CPI     00H             ;MINS = 0 ?
        JZ      SLF1
        XRA     A               ;CLEAR SELF TEST FLAG AND RETURN
        STA     TFLAG
        RET
SLF1:   LDA     TFLAG           ;TEST FLAG FOR FFH
        INR     A               ;FF BECOMES 00
        RZ                      ;RETURN IF FLAG SET
        MVI     A,0FFH          ;OTHERWISE SET FLAG AND DO SELF TEST
        STA     TFLAG
        XRA     A               ;TURN BOTH SOLENOIDS OFF
        OUT     04H
        CALL    DELAY           ;ALLOW VOLTAGE TO SETTLE
        MVI     A,06H           ;READ SOLENOID VOLTAGE - CHANNEL 6
        CALL    ATOD
        CPI     HIGH            ;COMPARE WITH SATISFACTORY VALUE
        JC      ERROR           ;ERROR IF TOO LOW
        MVI     A,07H           ;CHECK OTHER SOLENOID
        CALL    ATOD
        CPI     HIGH
        JC      ERROR           ;ERROR IF TOO LOW
        MVI     A,0FFH          ;TURN BOTH SOLENOIDS ON
        OUT     04H
        CALL    DELAY           ;ALLOW VOLTAGE TO SETTLE
        MVI     A,06H           ;READ SOLENOID VOLTAGE - CHANNEL 6
        CALL    ATOD
        CPI     LOW             ;COMPARE WITH SATISFACTORY VALUE
        JNC     ERROR           ;ERROR IF TOO HIGH
        MVI     A,07H           ;CHECK OTHER SOLENOID
        CALL    ATOD
        CPI     LOW
        JNC     ERROR           ;ERROR IF TOO HIGH
        MVI     A,0FH           ;HOLD PRESSURE VIA SOLENOIDS
        OUT     04H
        RET
ERROR:  LXI     H,SOLMSG        ;LOG FAILURE
        CALL    MSGE
        CALL    TIMLOG          ;AND TIME
        RET
ATOD:   OUT     05H             ;OUTPUT CHANNEL TO MUX
        OUT     03H             ;START CONVERSION
NEOC:   IN      04H             ;READ END OF CONVERSION FLAG
        RLC
        JNC     NEOC            ;LOOP UNTIL EOC
        IN      03H             ;READ A TO D CONVERTER
        RET
DELAY:  LXI     H,0700H         ;APPROX 10MS DELAY
DELAY1: DCR     L
        JNZ     DELAY1
        DCR     H
        JNZ     DELAY1
        RET
SOLMSG: DB      0AH,'SOLENOID FAILURE',00H
HIGH    EQU     8CH
LOW     EQU     61H
        DSEG
TFLAG:  DS      1
        END
;USES NO RAM AND THEREFORE NO SUBROUTINES
        NAME    ('SFAIL')
        CSEG
        PUBLIC  SFAIL
SFAIL:  MOV     C,M             ;PRINT APPROPRIATE SOFT FAIL MESSAGE
        XRA     A               ;CLEAR A
        ORA     C               ;TEST FOR NULL - END OF MESSAGE
        JZ      SAFE            ;MAKE PLANT SAFE/SHUT DOWN
SFCOUT: RIM                     ;TEST FOR PRINTER BUSY
        ANI     80H
        JNZ     SFCOUT          ;LOOP IF PRINTER BUSY
        MVI     B,0BH           ;PRINT CHARACTER IN C REG. - THIS ROUTINE
        MVI     A,0C0H          ;/IS A DIRECT COPY OF COUT
SFCO1:  SIM
        LXI     D,0172H         ;1200 BAUD DELAY
SFCO2:  DCR     E
        JNZ     SFCO2
        DCR     D
        JNZ     SFCO2
        STC
        MOV     A,C
        RAR
        MOV     C,A
        MVI     A,80H
        RAR
        XRI     80H
        DCR     B
        JNZ     SFCO1
        INX     H               ;GET NEXT CHARACTER IN MESSAGE
        JMP     SFAIL
SAFE:   XRA     A
        OUT     04H             ;TURN BOTH SOLENOIDS OFF - FAILS SAFE
                                ;ADD EXTRA SHUT DOWN ROUTINES HERE
        HLT                     ;HALT CONTROLLER
        END
;*********************************************************************
;SYSTEM CRASH - WATCHDOG TIMER CAUSES VECTORED RECOVERY AND LOGS
;TIME AND ADDRESS
;DESTROYS ALL REGS.
;*********************************************************************
        NAME    ('WTRAP')
        CSEG
        EXT     MSGE,NMOUT,BLOCK,BUFFST,BUFFRD,BUFFWR,COUT
        PUBLIC  WTRAP,WRETRY
WTRAP:  POP     D               ;GET CRASH ADDRESS OFF STACK
        LXI     H,WRETRY
        DCR     M               ;DECREMENT RETRY COUNTER
        JNZ     WTRAP1          ;EXECUTE SOFT RECOVERY
        HLT                     ;WAIT FOR SYSTEM RESET IF COUNT EXHAUSTED
WTRAP1: LXI     H,BUFFST        ;CLEAR OUTPUT BUFFER IN CASE CORRUPTED
        SHLD    BUFFRD
        SHLD    BUFFWR
        MVI     C,00H           ;SEND NULL TO CLEAR PRINTER
        CALL    COUT
        LXI     H,WDMSG         ;LOG CRASH
        CALL    MSGE
        MOV     A,D             ;LOG HIGH ORDER ADDRESS
        CALL    NMOUT
        MOV     A,E             ;LOG LOW ORDER ADDRESS
        CALL    NMOUT
        EI                      ;ENABLE INTERRUPTS AFTER COUT
        JMP     BLOCK           ;VECTORED RECOVERY
WDMSG:  DB      0AH,'TRAP WDOG ADR=',00H
        DSEG
WRETRY: DS      1               ;RETRY COUNTER
        END
;SNAKE - INVALID ADDRESS RANGE CAUSES EXECUTION OF RST7 - LOG ADDRESS,
;TIME AND EXECUTE VECTORED RECOVERY
;DESTROYS ALL REGISTERS
        NAME    ('SNAKE')
        CSEG
        EXT     NMOUT,BLOCK,MSGE,BUFFST,BUFFRD,BUFFWR
        PUBLIC  SNAKE
SNAKE:  POP     D               ;GET ERROR ADDRESS FROM STACK
        LXI     H,BUFFST        ;CLEAR OUTPUT BUFFER IN CASE CORRUPTED
        SHLD    BUFFRD
        SHLD    BUFFWR
        LXI     H,SNMSG         ;LOG ERROR MESSAGE
        CALL    MSGE
        MOV     A,D             ;HIGH ORDER ADDRESS
        CALL    NMOUT
        MOV     A,E             ;LOW ORDER ADDRESS
        CALL    NMOUT
        JMP     BLOCK           ;VECTORED RECOVERY AND LOG TIME
SNMSG:  DB      0AH,'SNAKE RST7 ADR=',00H
        END
;**********************************************************************
;SOFTWARE ERROR - LOG ADDRESS AND TIME AND EXECUTE VECTORED RECOVERY
;DESTROYS ALL REGS
;**********************************************************************
        NAME    ('DFAULT')
        CSEG
        EXT     NMOUT,BLOCK,MSGE
        PUBLIC  DFAULT
DFAULT: POP     D               ;GET ERROR ADDRESS OFF STACK
        LXI     H,SOFMSG        ;LOG ERROR
        CALL    MSGE
        MOV     A,D             ;LOG HIGH ORDER ADDRESS
        CALL    NMOUT
        MOV     A,E
        CALL    NMOUT
        JMP     BLOCK           ;VECTORED RECOVERY AND LOG TIME
SOFMSG: DB      0AH,'SOFT ERROR ADR=',00H
        END
;********************************************************************
;INITIALISE REGISTERS AND VARIABLES
;********************************************************************
        NAME    ('INITL')
        CSEG
        EXT     WRETRY,SRETRY,PRETRY,TFLAG,COUT
        PUBLIC  INITL,BUFFST,BUFFRD,BUFFWR,BUFEND
INITL:  LXI     H,0000H         ;SYNCHRONISE ALL REGISTERS
        LXI     B,0000H
        LXI     D,0000H
        PUSH    PSW
        POP     PSW
        MVI     A,08H           ;ENABLE ALL INTERRUPTS
        SIM
        IN      00H             ;RESET MEMORY FLAGS LATCH
        OUT     01H             ;CLEAR CPU ERROR LATCH
        LXI     H,BUFFST        ;INITIALISE OUTPUT BUFFER POINTERS
        SHLD    BUFFRD
        SHLD    BUFFWR
        MVI     A,0FFH          ;INITIALISE RESYNC RETRY COUNTER
        STA     SRETRY
        MVI     A,0AH           ;INITIALISE PRESSURE RETRY COUNTER
        STA     PRETRY
        MVI     A,05H           ;INITIALISE TRAP WATCHDOG RETRY COUNTER
        STA     WRETRY
        XRA     A               ;CLEAR SELF TEST FLAG
        STA     TFLAG
        MVI     C,00H           ;SEND NULL TO CLEAR PRINTER
        CALL    COUT
        RET
BUFEND  EQU     4000H
BUFFST  EQU     3F00H
        DSEG
BUFFRD: DS      2
BUFFWR: DS      2
        END
;********************************************************************
;OUTPUTS CHARACTER IN C REG. TO SERIAL DEVICE
;DESTROYS BC,A
;********************************************************************
        NAME    ('COUT')
        CSEG
        PUBLIC  COUT
COUT:   PUSH    D               ;SAVE DE
        DI                      ;DISABLE INTERRUPTS AND RST/HOLD
        OUT     00H             ;/COUT - SPEED SENSITIVE
        MVI     B,0BH           ;NO. BITS TO SEND
        MVI     A,0C0H          ;SET START BIT AND SOD ENABLE
CO1:    SIM                     ;OUTPUT CARRY TO SOD PIN
        LXI     D,0172H         ;1200 BAUD DELAY
CO2:    DCR     E
        JNZ     CO2
        DCR     D
        JNZ     CO2
        STC                     ;SET EVENTUAL STOP BITS
        MOV     A,C             ;ROTATE CHAR. RIGHT PUTTING NEXT
        RAR                     ;DATA BIT INTO CARRY
        MOV     C,A
        MVI     A,80H           ;SET EVENTUAL SOD ENABLE BIT
        RAR                     ;SHIFT DATA BIT IN CARRY TO SOD BIT
        XRI     80H             ;INVERT SOD DATA BIT
        DCR     B               ;CHARACTER SENT ?
        JNZ     CO1             ;NO, THEN SEND NEXT BIT
        RIM                     ;IF RST 6.5 DISABLED THEN LEAVE
        ANI     02H             ;/RST/HOLD DISABLED
        JNZ     RETURN
        OUT     02H
RETURN: POP     D               ;RESTORE DE - CALLING ROUTINE
        RET                     ;/MUST ENABLE INTERRUPTS IF REQUIRED
        END
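The bit sequence that COUT places on the SOD pin can be traced in a high-level language. The Python sketch below is an illustration based on the listing as reconstructed above and is not part of the original software. The function name `sod_levels` is hypothetical, and the timing loop is not modelled. At 1200 baud each level lasts about 833 microseconds.

```python
def sod_levels(char: int) -> list:
    """Return the 11 SOD pin levels sent for one character, in order."""
    levels = [1]                        # first SIM writes 0C0H: SOD high (start bit)
    shift = char & 0xFF
    for _ in range(10):                 # B = 0BH gives 11 bit times in total
        bit = shift & 1                 # RAR moves the next data bit into carry
        shift = (shift >> 1) | 0x80     # STC shifts in ones that become stop bits
        levels.append(bit ^ 1)          # XRI 80H inverts the bit before SIM
    return levels


if __name__ == "__main__":
    print(sod_levels(ord("A")))
```

The frame is therefore one start bit, eight data bits sent least significant bit first, and two stop bits, with the SOD pin carrying the inverse of the logical bit value.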
;********************************************************************
;PUTS CHARACTER IN C REG. INTO OUTPUT BUFFER
;DESTROYS A
;********************************************************************
        NAME    ('COUTBF')
        CSEG
        PUBLIC  COUTBF
        EXT     BUFFST,BUFEND,BUFFRD,BUFFWR
COUTBF: PUSH    H               ;SAVE HL REG.
        LHLD    BUFFWR          ;GET BUFFER WRITE POINTER
        INX     H               ;INCREMENT POINTER
        LDA     BUFEND          ;TEST FOR END OF BUFFER
        SUB     L
        JNZ     FULTST
        LDA     BUFEND+1
        SBB     H
        JNZ     FULTST
        LXI     H,BUFFST        ;RESET POINTER TO START OF BUFFER
FULTST: LDA     BUFFRD          ;TEST FOR BUFFER FULL
        SUB     L
        JNZ     WRITE
        LDA     BUFFRD+1
        SBB     H
        JZ      RETURN
WRITE:  SHLD    BUFFWR          ;SAVE BUFFER POINTER
        MOV     M,C             ;SAVE CHARACTER IN BUFFER
RETURN: POP     H               ;RESTORE HL REG.
        RET
        END
;PRINT A CHARACTER FROM BUFFER UNLESS EMPTY
;********************************************************************
        NAME    ('PRBUFF')
        CSEG
        EXT     BUFFRD,BUFFWR,BUFEND,BUFFST,COUT
        PUBLIC  PRBUFF
PRBUFF: RIM                     ;READ SID LINE
        ANI     80H             ;TEST MSB - PRINTER BUSY LINE
        RNZ                     ;RETURN IF BUSY
        LHLD    BUFFRD          ;TEST IF BUFFER EMPTY
        LDA     BUFFWR
        SUB     L
        JNZ     READ            ;JUMP IF NOT EMPTY
        LDA     BUFFWR+1
        SBB     H
        RZ                      ;RETURN IF BUFFER EMPTY
READ:   INX     H               ;INCREMENT BUFFER POINTER
        LDA     BUFEND          ;TEST FOR END OF BUFFER
        SUB     L
        JNZ     READ1           ;JUMP IF NOT
        LDA     BUFEND+1
        SBB     H
        JNZ     READ1           ;JUMP IF NOT
        LXI     H,BUFFST        ;OTHERWISE SKIP TO BEGINNING OF BUFFER
READ1:  SHLD    BUFFRD          ;STORE NEW BUFFER POINTER
        MOV     C,M             ;READ A CHARACTER FROM BUFFER
        CALL    COUT            ;PRINT IT
        EI                      ;ENABLE INTERRUPTS DISABLED BY COUT
        RET
        END
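COUTBF and PRBUFF together form a circular output buffer with separate write and read pointers. The Python sketch below restates that scheme for illustration and is not part of the original software. The class name `OutputBuffer` and its methods are hypothetical, list indices stand in for the RAM addresses BUFFST to BUFEND, and the printer busy test is omitted.

```python
class OutputBuffer:
    """Circular buffer in which both pointers are advanced before use."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.read = 0       # plays the role of BUFFRD
        self.write = 0      # plays the role of BUFFWR

    def _advance(self, index: int) -> int:
        """Increment a pointer, wrapping from the end back to the start."""
        index += 1
        return 0 if index == len(self.cells) else index

    def put(self, char) -> bool:
        """Queue a character as COUTBF does; drop it if the buffer is full."""
        nxt = self._advance(self.write)
        if nxt == self.read:
            return False            # full: the character is discarded
        self.write = nxt
        self.cells[nxt] = char
        return True

    def get(self):
        """Fetch the next character as PRBUFF does; None if empty."""
        if self.read == self.write:
            return None
        self.read = self._advance(self.read)
        return self.cells[self.read]


if __name__ == "__main__":
    buf = OutputBuffer(4)
    print([buf.put(ch) for ch in "abcd"])
    print([buf.get() for _ in range(4)])
```

One cell is always left unused so that equal pointers can only mean an empty buffer, which gives a capacity of one less than the buffer size.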
;PUTS A MESSAGE POINTED TO BY HL REG. INTO OUTPUT BUFFER
;DESTROYS HL,A
        NAME    ('MSGE')
        CSEG
        PUBLIC  MSGE
        EXT     COUTBF
MSGE:   PUSH    B               ;SAVE BC
CHAR:   MOV     C,M             ;GET A CHARACTER
        XRA     A
        ORA     C
        JZ      RETURN          ;RETURN IF NULL
        CALL    COUTBF          ;SEND CHARACTER
        INX     H               ;GET NEXT CHARACTER
        JMP     CHAR
RETURN: POP     B               ;RESTORE BC
        RET
        END
;********************************************************************
;SENDS NUMBER IN A REG. TO OUTPUT BUFFER AS 2 HEX DIGITS
;DESTROYS A
;********************************************************************
        NAME    ('NMOUT')
        CSEG
        PUBLIC  NMOUT
        EXT     COUTBF
NMOUT:  PUSH    B               ;SAVE BC
        PUSH    PSW             ;SAVE ARGUMENT
        RRC                     ;GET UPPER 4 BITS
        RRC
        RRC
        RRC
        ANI     0FH             ;SELECT LSB
        CALL    CNVRT           ;CONVERT TO ASCII
        MOV     C,A
        CALL    COUTBF          ;SEND IT
        POP     PSW             ;RESTORE ARGUMENT
        ANI     0FH             ;GET LOWER BITS
        CALL    CNVRT           ;CONVERT TO ASCII
        MOV     C,A
        CALL    COUTBF          ;SEND IT
        POP     B               ;RESTORE BC
        RET
CNVRT:  ORI     30H             ;ADD OFFSET FOR 0-9
        CPI     3AH             ;TEST IF A-F
        RC                      ;RETURN IF NOT
        ADI     07H             ;OTHERWISE ADD FURTHER OFFSET OF 7
        RET
        END
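The conversion performed by NMOUT and CNVRT can be checked with a short high-level model. The Python sketch below is an illustration and is not part of the original software. The function names mirror the assembly labels, but the functions return values instead of writing to the output buffer.

```python
def cnvrt(nibble: int) -> int:
    """Convert a 4-bit value to the ASCII code of its hex digit."""
    code = nibble | 0x30        # ORI 30H: correct for 0-9
    if code < 0x3A:             # CPI 3AH / RC: digits need no further change
        return code
    return code + 0x07          # ADI 07H: skip the gap between '9' and 'A'


def nmout(value: int) -> str:
    """Return the two hex digits that NMOUT would queue for printing."""
    upper = (value >> 4) & 0x0F  # four RRCs followed by ANI 0FH
    lower = value & 0x0F
    return chr(cnvrt(upper)) + chr(cnvrt(lower))


if __name__ == "__main__":
    for value in (0x09, 0x3F, 0xA0):
        print(nmout(value))
```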
;********************************************************************
;LOGS DATE AND TIME
;DESTROYS A,HL,C
;********************************************************************
        NAME    ('TIMLOG')
        CSEG
        EXT     MSGE,COUTBF
        PUBLIC  TIMLOG
TIMLOG: LXI     H,TABLE         ;STORE DATE AND TIME IN TABLE
        IN      1CH             ;10'S MONTHS
        CALL    TSAVE
        IN      1BH             ;MONTHS
        CALL    TSAVE
        IN      19H             ;10'S DAYS
        CALL    TSAVE
        IN      18H             ;DAYS
        CALL    TSAVE
        IN      17H             ;10'S HOURS
        CALL    TSAVE
        IN      16H             ;HOURS
        CALL    TSAVE
        IN      15H             ;10'S MINS.
        CALL    TSAVE
        IN      14H             ;MINS.
        CALL    TSAVE
        LXI     H,MSG1          ;DATE
        CALL    MSGE
        LXI     H,TABLE+2       ;DAY
        CALL    DBLOUT
        MVI     C,':'
        CALL    COUTBF
        LXI     H,TABLE         ;MONTH
        CALL    DBLOUT
        LXI     H,MSG2          ;TIME
        CALL    MSGE
        LXI     H,TABLE+4
        CALL    DBLOUT
        MVI     C,':'
        CALL    COUTBF
        CALL    DBLOUT
        MVI     C,0AH
        CALL    COUTBF
        RET
TSAVE:  ANI     0FH             ;MASK OFF 4 LSB
        CPI     0FH             ;HAS REGISTER BEEN UPDATED?
        JZ      AGAIN           ;RETRY
        ORI     30H             ;CONVERT TO ASCII
        MOV     M,A             ;STORE IN TABLE
        INX     H               ;INCREMENT TABLE POINTER
        RET
AGAIN:  POP     H               ;DUMMY TO RESTORE STACK
        JMP     TIMLOG          ;READ ALL REGISTERS
DBLOUT: MOV     C,M             ;SEND 2 CHARACTERS TO BUFFER
        CALL    COUTBF
        INX     H
        MOV     C,M
        CALL    COUTBF
        INX     H
        RET
MSG1:   DB      0AH,'DATE ',00H
MSG2:   DB      ' HR ',00H
        DSEG
TABLE:  DS      8
        END
