Self checking circuits applied to the fault diagnosis of microprocessor based systems by Hollis, T. J.
        
University of Bath
PHD








Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners
and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
            • Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
            • You may not further distribute the material or use it for any profit-making activity or commercial gain
            • You may freely distribute the URL identifying the publication in the public portal ?
Take down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately
and investigate your claim.
Download date: 13. May. 2019
SELF CHECKING CIRCUITS APPLIED TO THE FAULT
DIAGNOSIS OF MICROPROCESSOR BASED SYSTEMS
submitted by T. J. Hollis 
for the degree of PhD 
of the University of Bath 
19B6
COPYRIGHT
Attention is drawn to the fact that copyright of this 
thesis rests with its author. This copy of the thesis has 
been supplied on condition that anyone who consults it is 
understood to recognise that its copyright rests with its 
author and that no quotation from the thesis and no 
information derived from it may be published without the 
prior written consent of the author.
This thesis may be made available for consultation within 
the University Library and may be photocopied or lent to 
other libraries for the purposes of consultation.
UMI Number: U369651
All rights reserved
INFORMATION TO ALL USERS 
The quality of this reproduction is dependent upon the quality of the copy submitted.
In the unlikely event that the author did not send a complete manuscript 
and there are missing pages, these will be noted. Also, if material had to be removed,
a note will indicate the deletion.
Dissertation Publishing
UMI U369651
Published by ProQuest LLC 2013. Copyright in the Dissertation held by the Author.
Microform Edition © ProQuest LLC.
All rights reserved. This work is protected against 
unauthorized copying under Title 17, United States Code.
ProQuest LLC 
789 East Eisenhower Parkway 
P.O. Box 1346 
Ann Arbor, Ml 48106-1346




This thesis examines the fault diagnosis of microprocessor 
based systems from the use of existing test techniques and 
equipment to the design and application of self checking 
circuits.
Initially, some existing techniques and equipment for the 
testing of microprocessor based systems are detailed, 
together with the test philosophies of twelve major 
manufacturers or users of such systems. This includes a 
specific application of in-circuit emulation.
Consideration of the testing process at the design stage 
has resulted in built in test at board and chip level. 
Common forms of built in test are studied, particularly 
error detection codes, and a summary provided of available 
integrated circuits with built in test.
The aspect of built in test where a system is tested 
concurrently with normal operation makes extensive use of 
fault detection or checking circuits. The problem of 
failures in these checking circuits is resolved with the 
use of self checking circuits.
The theory of self checking circuits is introduced with 
formal definitions of their characteristics. Based on the 
tests required to detect all single failures in logic 
gates and the use of Karnaugh maps, a technique is 
proposed for the design of totally self checking circuits. 
Circuit designs are presented for several self checking 
code checkers.
The mechanisms required to construct a self checking 
microprocessor based system are discussed. These allow 
a device or circuit failure in the system to be precisely 
located and include the use of signal isolation circuits. 
A review of proposals for self checking devices and 
systems reveals how extensively the various self checking 
mechanisms are adopted.
Finally, an experimental self checking computer is 
described, in which the application of self checking 
circuits to the fault diagnosis of microprocessor based 
systems is practically evaluated.
ACKNOWLEDGEMENTS
At the University of Bath, the author wishes to thank Mr.
A. R. Daniels (Senior Lecturer in the School of Electrical 
Engineering) for his guidance as Supervisor during the
research and for the provision of facilities within the
School of Electrical Engineering. The guidance given Dr.
S. L. Hurst (Senior Lecturer in the School of Electrical 
Engineering) is also gratefully acknowledged. In 
addition, the author appreciates the assistance given by 
Dr. R. T. Lipczynski (Lecturer in the School of Electrical 
Engineering) during the industrial survey.
The provision of a Case Award by the Science and Engineer­
ing Research Council, Swindon, is also acknowledged.
At Rolls Royce pic, Bristol, the author wishes to thank 
Mr. R. D. Fyfe (ADR Applications Manager) and Mr. A. C. 
Plenty (Head of Electronics and Instrumentation) for the 
provision of time and facilities during the past three 
years, as well as Mr. P. L. Westlake (Systems Planner) for 
many helpful discussions and Mr. M. S. Whitfield (ADR3 
Project Manager : Systems Procurement) as industrial
supervisor during the first three years.
The assistance given by Ms. H. M. R. Hollis (Architectural 
Designer at the Welsh School of Architecture, Cardiff) and 
Mr. A. Burn (Senior Systems Engineer at Rolls Royce pic, 
Bristol) in the preparation of figures for this thesis is 
gratefully acknowledged, as is the assistance given by Mr.
P. D. Burn (Sales Engineer at Multibloc Sales Ltd.,
Keynsham, Bristol) in the wiring of the experimental 
computer.
During the industrial survey, the co-operation of the 
following companies is noted; Lucas Aerospace Ltd., Hemel 
Hempstead; Satchwell Control Systems Ltd., Slough; British 
Aerospace, Dynamics Group, Stevenage; Renishaw Electrical 
Ltd., Wotton-Under-Edge; Marconi Avionics Ltd., Micro­
processor Division, Milton Keynes; Digital Equipment 
Company Ltd., Bristol; Ferranti Computing Systems Ltd., 
Bracknell; Data General, Bristol; Cifer Systems Ltd., 
Melksham; Negretti and Zambra Avionics Ltd., Eastleigh; 
and Square 'D' Ltd., Swindon.
Finally, the author is indebted to his parents, Mr. and 





Asynchronous Communications Interface Adaptor 
Arithmetic Logic Unit
American Standard Code for Information 
Interchange
Automatic Test Equipment 
Buffer
bits per second 




Complementary Metal-Oxide Semiconductor 
Central Processing Unit 
Hamming distance 
Diode or Data line 
Direct Current 
DED/DEC Double Error Detecting/Double Error Correcting
DIR Direction
DMA Direct Memory Access
EXOR Exclusive-OR
Fs secure fault set
Ft tested fault set
G enable
ECL Emitter-Coupled Logic
EPROM Erasable Programmable Read Only Memory










LED Light Emitting Diode
LFSR Linear Feedback Shift Register
LSI Large Scale Integration
M Mega (106 )
MOS Metal-Oxide Semiconductor
MSI Medium Scale Integration
MUX Multiplexer
N secure input set
PCI Programmable Communications Interface
PSC Partially Self Checking
PIA Peripheral Interface Adaptor























RAM Random Access Memory
ROM Read Orily Memory
RST Reset




SED/SEC Single Error Detecting/Single Error Correcting
SSI Small Scale Integration
STO Self Testing Only
SV Self Verification
TPG Test Pattern Generation
TR Transistor
TSC Totally Self Checking
TTL Transistor-Transistor Logic
U the Universal set
*jP microprocessor
V Volts
Vcc Positive supply rail
VDU Visual Display Unit (terminal)
VLSI Very Large Scale Integration
VMA Valid Memory Address
X normal input set
X^ vector X










does not equal 
approximately equals 
is equivalent to 
less than
less than or equal to 
greater than
greater than or equal to 
vector







is a member of
is not a member of
-v-
implies that 
3  is a superset of
C  is a subset of
C  is a subset of or equal to
V for all
^ for not all
| given that
3 there is some
—> to or maps to
<— from
£ } Set
0 clock phase or null set
n  rounded up to the nearest integer
LJ rounded down to the nearest integer
Morphic AND
A logical complement of A or A is active low





SYMBOLS AND ABBREVIATIONS iv
CHAPTER ONE : INTRODUCTION 1
1.1 : REFERENCES 6
1.2 : FIGURES 11
CHAPTER TWO : EXISTING TECHNIQUES AND EQUIPMENT 14
2.1 : SIGNATURE ANALYSIS 14
2.2 ; LOGIC ANALYSIS 16
2.3 : AUTOMATIC TEST EQUIPMENT 20
2.3.1 : In-Circuit Testing 20
2.3.2 s Functional Testing 21
2.3.3 : In-Circuit and Functional Testing Combined 22
2.3.4 s ATE In Use 23
2.4 : SELF TEST 24
2.5 : IN-CIRCUIT EMULATION 26
2.6 s COMBINED TECHNIQUES 29
2.7 : PROVISION FOR TEST 31
2.8 : REFERENCES 32
2.9 : FIGURES 34
CHAPTER THREE : BUILT IN TEST - AN INTRODUCTION 39
3.1 s FAULTS AND FAULT MODELS 39
3.1.1 s Stuck-At Fault Model 40
3.1.2 s LSI/VLSI Failures 42
3.1.3 : Bridging Faults (shorts) 43
3.1.4 : Unidirectional Faults 43
3.1.5 : Functional Fault Model 43
3.2 : OBSERVABILITY AND CONTROLLABILITY 44
3.3 s DESIGN FOR TESTABILITY 45
3.4 : REDUNDANCY 46
3.5 : REFERENCES 48
3.6 : FIGURES 51
CHAPTER FOUR : TECHNIQUES FOR BUILT IN TEST 54
4.1 : MODES OF TESTING 55
4.1.1 : On Line Testing 55
4.1.2 : Off Line Testing 55
4.2 : THE LINEAR FEEDBACK SHIFT REGISTER 56
4.3 : ERROR DETECTING CODES 57
-vii-
4.3.1 ; Single Parity Codes 58
4.3.2 : B-Adjacent Codes 59
4.3.3 : Duplication Codes 59
4.3.4 : Checksum Codes 60
4.3.5 : AN Codes 61
4.3.6 : Residue Codes 62
4.3.7 : M-out-of-n Codes 63
4.3.8 : Berger Codes 64
4.3.9 s Cyclic Codes 64
4.3.9.1 : Non Systematic Cyclic Codes 65
4.3.9.2 : Systematic Cyclic Codes 65
4.3.10 : Hamming Codes 66
4.4 : SCAN DESIGN 68
4.4.1 : Scan Path Design 69
4.4.2 : Level Sensitive Scan Design 70
4.4.3 : Advantages and Limitations of Scan Design 70
4.4.4 s STUMPS 71
4.5 : THE BUILT IN LOGIC BLOCK OBSERVER 72
4.6 : AUTONOMOUS TESTING 73
4.6.1 : Partitioning with Multiplexers 75
4.6.2 : Sensitised Partitioning 75
4.7 : SERIAL SHADOW REGISTER 76
4.8 s OTHER BIT TECHNIQUES 78
4.8.1 : Scan Set Logic 78
4.8.2 : Random Access Scan 78
4.8.3 : Syndrome Testing 78
4.8.4 : Walsh Coefficient Testing 78
4.8.5 : ROM Based Test Patterns and Responses 79
4.8.6 s Self Oscillation 79
4.8.7 : Self Comparison 79
4.8.8 : History Memory 79
4.8.9 : Reed Muller Canonical Representation 79
4.8.10 : Technique Comparisons 80
4.9 : INTEGRATED CIRCUITS WITH BIT 80
4.9.1 : Motorola MC6805 80
4.9.2 : Motorola M68000 81
4.9.3 : Intel iAPX 432 81
4.9.4 : Monolithic Memories 74S818 81
4.9.5 : Monolithic Memories 63DA1643/841/441 82
4.9.6 : National Semiconductor SLX6360 82
4.9.7 : National Semiconductor DP8400 82
4.9.8 : LSI Logic LSA2000 82
4.9.9 : Hitachi Gate Arrays 83
4.10 : SELF VERIFICATION 83
4.11 : CONCLUSIONS 84
4.12 : REFERENCES 86
4.13 : FIGURES 93
CHAPTER FIVE : SELF CHECKING CIRCUITS
- AN INTRODUCTION
-viii-
5.1 : INTRODUCTION 113
5.2 : DEFINITION OF TERMS 114
5.3 : TOTALLY SELF CHECKING CIRCUITS AND NETWORKS 119
5.3.1 s TSC Circuits 119
5.3.2 s TSC Networks 121
5.4 : TOTALLY SELF CHECKING CHECKERS 124
5.4.1 : General Structure 124
5.4.2 : TSC Checker for Separable Codes 126
5.5 s PARTIALLY SELF CHECKING NETWORKS 127
5.5.1 : Type 1 Networks 127
5.5.2 : Type 2 Networks 130
5.5.3 : Type 3 Networks 130
5.5.4 : A Partially Self Checking Logic Unit 132
5.6 s SELF TESTING ONLY CIRCUITS AND NETWORKS 133
1365.7 s SELF CHECKING PROPERTIES OF BIT AND BYTE-SLICED CIRCUITS
5.7.1 : Bit-Sliced Circuits 136
5.7.2 : Byte-Sliced Circuits 138
5.8 : SELF CHECKING SEQUENTIAL CIRCUITS 139
5.8.1 : Fault Security 139
5.8.2 : Self Test 142
5.9 s LITERATURE REVIEW 142
5.10 s REFERENCES 145
5.11 : FIGURES 149
CHAPTER SIX : THEORY AND DESIGN OF SELF
CHECKING CIRCUITS 165
6.1 s INTRODUCTION 165
6.2 s TESTING FOR FAILURES IN AND, , „
OR AND INVERTER GATES
6.3 : TESTING FOR FAILURES IN EXCLUSIVE-OR GATES 168
6.4 : TESTING FOR FAILURES IN CASCADED GATES 169
6.5 : DESIGN OF TSC CHECKERS USING KARNAUGH MAPS 174
6.5.1 : TSC AND/OR Structures 175
6.5.2 : TSC OR/AND Structures 177
6.6 : A TSC N-BIT COMPARATOR 178
6.7 : A TSC PARITY CODE CHECKER 184
6.8 : A TSC 1-OUT-OF-N CODE CHECKER 186
6.9 : A TSC PERIODIC SIGNAL CHECKER 190
6.10 : LITERATURE REVIEW 191
6.11 : REFERENCES 194
6.12 : FIGURES 198
-ix-
CHAPTER SEVEN s FAULT DETECTION AND DIAGNOSIS
IN SELF CHECKING SYSTEMS
7.1 : INTRODUCTION
7.2 : POSITIONING OF CHECKERS
7.3 : DATA PATHS AND MEMORY
7.3.1 : Choice of Error Detection Codes
7.3.2 : Selection and Addressing Errors
7.3.3 : Code Translation
7.3.4 : Interleaved Coding
7.4 s CONTROL PATHS
7.4.1 : Clocks
7.4.2 s Control Lines
7.4.3 : Decoders
7.5 j ARITHMETIC OPERATIONS
7.6 : LOGICAL OPERATIONS
7.7 : HARDWARE CHECKS FOR SOFTWARE FAULTS
7.8 : FAULT DIAGNOSIS
7.8.1 : Fault Indication
7.8.2 s Fault Isolation
7.8.3 s Transient Error Indications
7.8.4 : Verification and Efficiency of
Fault Detection Circuits
7.8.5 : Marginal Testing
7.9 : SUMMARY AND CONCLUSIONS
7.10 : REFERENCES
7.11 : FIGURES
CHAPTER EIGHT ; SELF CHECKING SYSTEMS AND
DEVICES : A REVIEW
8.1 : INTRODUCTION
8.2 : SELF CHECKING INTEGRATED CIRCUITS
8.3 : PROPOSED SELF CHECKING SYSTEMS AND DEVICES
8.3.1 68000 Microprocessor
8.3.2 MIL-STD-1750A Microprocessor
8.3.3 Microprogram Control Unit
8.3.4 Four-Bit Microprocessor
8.3.5 Error Detection Processor
8.3.6 The PAD
8.3.7 Generalised VLSI Chip
8.3.8 Morphic AND Gate
8.3.9 VLSI Building Blocks
8.3.10 Research Microcomputer




CHAPTER NINE : AN EXPERIMENTAL SELF CHECKING COMPUTER 286
9.1 : INTRODUCTION 286
9.2 : THE UNCHECKED SYSTEM 286
9.3 : THE CHECKED SYSTEM 289
9.3.1 : CPU 289
9.3.2 : System and CPU Clocks 291
9.3.3 : Reset Line 292
9.3.4 : Data Transmission Paths 292
9.3.5 : Memory 295
9.3.6 : Address Decoding 297
9.3.7 s I/O Devices 298
9.3.8 : Control Logic 299
9.3.9 : Isolators 300
9.4 : FAULT INDICATION AND ERROR CONTROL LOGIC 301
3039.5 : PRACTICAL IMPLEMENTATION OF TOTALLY SELF CHECKING COMPARATORS
9.6 : TESTING THE EXPERIMENTAL COMPUTER 303
9.7 : REFERENCES 308
9.8 : FIGURES 309
CHAPTER TEN : CONCLUSIONS AND FURTHER WORK 327
APPENDIX A s PRACTICAL IN-CIRCUIT EMULATION 334
A. 1 : OPERATION OF THE PROCESSOR SYSTEM 334
A.2 : THE TESTS * 334
A.3 : THE TESTS IN USE 337
A.4 : REFERENCES 338
A.5 : FIGURES 339
APPENDIX B : SIGNAL ISOLATION CIRCUITS 341
B.l : INTRODUCTION 341
B.2 : SWITCHED UNIDIRECTIONAL ISOLATORS 342
B.3 : BIDIRECTIONAL ISOLATORS 343
B.4 : UNSWITCHED UNIDIRECTIONAL ISOLATORS 344
B.5 : REFERENCES 347
B .6 : FIGURES 349
-xi-
CHAPTER ONE : INTRODUCTION
A continuing increase in the complexity of integrated 
circuits, namely Very Large Scale Integration (VLSI), has 
meant more logic power per square millimetre of circuit 
area.
The introduction of the microprocessor has led to a major 
advance in product design, with a reduction in the design 
time and a realisation of products which were previously 
inconceivable.
However, this increased density of integration has result­
ed in internal circuits or modules becoming transparent to 
the outside world, which means that they cannot be direct­
ly controlled or observed. This is certainly true of 
microprocessors, which have the added complexity of a bus 
structure, so that it is necessary to observe the activity 
on the address data and control buses simultaneously to 
determine its exact operation at any time.
These factors have contributed to the creation of a 
serious challenge to the electronics industry; namely how 
to service processor boards and the products which use 
them. Testing and troubleshooting using conventional 
equipment is a difficult and time consuming task, which 
requires skilled personnel, of which there is generally a 
shortage, since most of them are actively involved with 
design.
In order to combat this problem, the industry has been 
searching for new tools, techniques and equipment which 
can be used in the field, at a central or local repair 
base and also on the production line, by people who are 
not necessarily familiar with the operation and structure 
of such systems. Design and debugging during development 
also benefit from these techniques.
Some general constraints for test techniques or equipment
-1-
are:
1) An ability to work with existing and future process­
ors .
2) Provision for operating with various logic types.
3) Facilities for testing the analogue sections of a 
system.
4) Speed for real time measurement.
5) Ease of installation into the unit under test.
Expanding these points:
1) In order not to become rapidly outdated with the 
introduction of new processors, the equipment must be 
able to work with a wide variety of processors, sup­
porting eight and sixteen bit architectures, and be 
able to cater for new ones as they become available, 
notably those with thirty-two bit architectures. This 
is generally achieved by using a dedicated probe for 
each processor and sufficient flexibility in the main 
instrument to cope with present and future needs.
2) The use of a wide variety of technologies, notably 
within interface circuits, has resulted in a require­
ment to operate with different voltage levels, whether 
these be those of TTL, MOS or ECL.
3) Since analogue/digital interfaces are crucial to the
diversity of processor based systems, it is an advan­
tage if analogue testing can be handled as well. Much 
analogue testing now relies on the digitisation of its 
signals for subsequent processing and display.
4) The speed at which the test equipment operates should
be higher than that of the system it is testing. This
means that the stimulation and observation of repon­
ses can be carried out in real time. Problems due to 




5) A lot of test equipment has been designed around the 
IEEE488 bus Cl.13, which allows a rapid connection of 
the instrument to the unit under test, a defined 
communication link, as well as compatibility with 
other instrumentation.
Much has been written on current techniques and equipment, 
but to get a practical viewpoint, a survey of twelve 
major manufacturers and/or users of microprocesser based 
systems was instigated to establish their particular test 
philosophies. The various strategies adopted by these 
companies are detailed in Chapter 2 from the viewpoint of:
1) The benefits they offer.
2) Limitations to their performance.
3) The extent of their usage.
Considerable personal practical experience, obtained over 
a number of years, is also incoporated into the appropri­
ate section. In addition, Appendix A describes a specific 
application of in-circuit emulation to a microprocessor 
based system.
Most of the techniques in extensive current use require 
some form of external equipment to perform the testing. 
With technology progressing so rapidly, this equipment 
soon becomes inflexible or is simply unable to cope. 
Testing can become a costly process if the equipment has 
to be replaced.
During the past twenty years, considerable thought has 
been given to catering for the test process at the design 
stage and not as an after-thought when the design, manu­
facture and installation phases of a product have been 
completed. This is Built-in-Test (BIT), which is intro­
duced in Chapter 3 with a discussion of faults and fault 
models, observability and controllability, design for
testability and redundancy. These four concepts are used
✓
-3-
to describe and specify BIT. In Chapter 4 the more 
common forms of BIT are studied, again considering the 
benefits and limitations of each technique. The use of 
error detecting codes, scan design, built in logic block 
observers, autonomous testing, serial shadow registers and 
design for self verification are included in this study.
Figure 1.1 summarises the possible test strategies for 
VLSI (equipment, techniques and built-in hardware or soft­
ware) and details many references for further information.
An integral part of many of the BIT strategies discussed 
in Chapter 4 is an ability to check a microprocessor based 
system whilst it is on line carrying out its normal 
functions. However, what happens if one or more of the 
checking circuits fail? Is there a checker checking the 
checker? The solution is to use self checking circuits, 
circuits which effectively check themselves. The rest of 
the thesis is devoted to this subject.
In Chapter 5 self checking circuits are introduced with 
formal definitions of their characteristics. This in­
cludes the concepts of self testing only (STO) circuits, 
partially self checking (PSC) circuits and totally self 
checking (TSC) circuits. Self checking (SC) networks, SC 
checkers, SC bit-sliced circuits, SC byte-sliced circuits 
and SC sequential circuits are also discussed.
Chapter 6 investigates the theory and design of SC cir- 
uits. The tests which have to be performed on individual 
logic gates to detect failures within them are significant 
in determining if a circuit is SC or not, so these tests 
are considered in some detail. Based on the resulting 
test requirements and the use of Karnaugh maps, a 
technique is then proposed for the design of SC circuits. 
Subsequent sections in Chapter 6 describe the design of 
TSC comparators and periodic signal checkers, as well as 
TSC 1-out-of-n and parity code checkers.
-4-
Since this thesis is concerned with the testing of 
microprocessor based systems, Chapter 7 discusses the 
requirements to make such systems self checking. The main 
objective here is a means of precisely locating a faulty 
device or circuit during normal operation. Various 
techniques are presented for fault detection and fault 
diagnosis. Chapter 8 then examines the extent to which 
these techniques are employed in current and past research 
by reviewing a number of self checking devices and 
systems.
In order to support the philosphies of Chapter 7 and the 
conclusions from Chapter 8, a minimal self checking micro­
processor based system has been built using the designs of 
Chapter 6. This is described in Chapter 9 together with 
results from its subsequent evaluation. An essential 
requirement for the construction of this system was a 
means of fault isolation between the transmitting and 
receiving points of an interconnection; notably the 
address, data and control buses. This design work for 
unidirectional and bidirectional'isolators is detailed in 
Appendix B.
Finally, overall conclusions on the design and use of SC 
circuits in microprocessor systems are presented, as well 
as suggestions for further work.
-5-
1.1 : REFERENCES
1.1) IEEE-488 instruments - A. Santoni; EDN; October 28, 
1981; pp. 77-90.
1.2) How to troubleshoot and repair microcomputers - J. 
D. Lenk; Reston Publishing Inc.; Virginia, USA; 
1980; pp. 47-61.
1.3) Troubleshooting microprocessors and digital logic 
- R. L. Goodman; Tab Books Inc.; PA., USA; 1980; 
Chapter 5.
1.4) Practical troubleshooting techniques for micropro­
cessor systems - J. W. Coffron; Prentice Hall Inc.; 
New Jersey, USA; 1981; pp. 134-7.
1.5) as per 1.3); Chapter 4.
1.6) Digital testing oscilloscope cushions against
mounting costs of troubleshooting - W. E. Shoe­
maker, D. W. Wilson; Electroncs; March 16, 1978;
pp. 105-112.
1.7) ICE development heats up to handle new processors
D. G. West, R. Paulson; Electronic Design; 
September 3, 1981; pp. 125-130.
1.8) Fast emulator debugs 8085-based microcomputers in
real time - M. Y. Yen; Electronics; July 21, 1977;
pp. 108-112.
1.9) Hardware emulation conquers the testing mountain
created by 16 bit uPs - D. Lemke, D. Smith, T. 
Tunder; Electronic Design;No. 14, July 5, 1979;
pp. 54-59.
1.10) Cut hardware, software development costs - take
advantage of in-circuit emulation - J. M. Kelley; 
Electronic Design; No. 17, August 16, 1979; pp.
66-71.
1.11) Use of the slave emulator - T. McCann; Micropro­
cessors & Microsystems; Vol. 5, No.2, March 1981; 
pp. 65-68.
1.12) ROM simulator, memory tracer debug microsystems
J. L. Nichols, R. W. Ulrickson; Electronic 
Design; July 23, 1981; pp. 147-154.
1.13) as per 1.4); Chapters 3, 4, 5 and 9.
1.14) Emulator solves system-level debug problems - R. 
Parks, K. Greenberg; Electronic Design; May 14, 
1981; pp. 173-180.
1.15) Fault diagnostics for microprocessor based systems 




















B.Sc. in Electrical Engineering; University of 
Bath, England; 1980; p. 16.
as per 1.2); pp. 273-284.
as per 1.4); Chapter 7.
Instruments - A. Santoni; EDN; July 20,1979; p. 97. 
as per 1.15); pp. 13-15.
Signature analysis wins new acclaims - M. Marshall; 
Electronics;Fehruary 14, 1980; pp. 103-104.
Applying signature analysis to existing processor- 
based products - R. Rhodes Burke; Electronics; 
February 24, 1981; pp. 127-133.
as per 1.2); pp. 63-105.
Logic-analyzer 'computer' simplifies computer test­
ing - G. Brock; Electronic Design; June 25, 1981;
pp. 101-108.
What to look for in logic timing analysers - M. 
Marshall; Electronics; March 29, 1979; pp. 109-114.
Designing a servicemans' needs into microprocessor- 
based systems - M. Neil, R. Goodner; Electronics; 
March 1, 1979; pp. 122-128.
as per 1.4); Chapter 6.
Flexible pattern generation aids logic analysis
- S. Palmquist, M. Pettet; Electronic Design; 
October 15, 1981; pp. 165-174.
Functional and in-circuit testing team up to tackle 
VLSI in the '80s - P. Hansen; Electronics; April 
21, 1981; pp. 189-195.
Application of guided probe and in-circuit test 
architectures to compensate for inadequate UUT 
testability - A. Lowenstein; Autotestcon '80*;
pp. 60-66.
Optimised digital testing using general purpose ATE
- P. J. Hand; Autotestcon '80*; pp. 259-263.
ATE swings towards merged in-circuit, functional 
tests - J. McLeod; Electronic Design; October 29, 
1981; pp. 90-104.
To sort out card testers, ask the right questions
- S. L. Black; Electronic Design; February 5, 1981; 
pp. 115-119.
Automating test generation closes the design loop
-7-
- R. Hickling, G. Case; Electronics; November 30, 
1981; p.129-133.
1.34) Test LSI boards functionally on an in-circuit 
tester - T. Jackson, P. Vais; Electronic Design; 
October 29, 1981; pp. 137-144.
1.35) Design for in-situ chip testing with a compact 
tester - C. C. Perkins, S. Sangani, R. Stopper, M. 
Valitski; 1980 Test Conf.*; pp. 29-41.
1.36) Software package generates in-circuit programs 
automatically - C. Bostrom; Electronic Design; 
February 5, 1981; pp. 99-102.
1.37) In-circuit tester takes on ECL, TTL, and MOS 
devices - M. P. Carroll, G.W. Peterson; Electronic 
Design; May 28, 1981; pp. 91-97.
1.38) Test vectors development and optimisation for a 
microprocessor - C. C. Timoc, L. M. Hess, F. R. 
Stott; Autotestcon '80*; pp. 165-170.
1.39) Interactive design simplifies test-program gen­
eration - B. Kelley, M. Bonham, R. Chruscial; 
Electronic Design; February 19, 1981; pp. 169-173.
1.40) Microprocessor functional testing - R. Chantal, S. 
Garbriele; 1980 Test Conf.*; pp. 433-443.
1.41) Enhanced simulator takes -on bus structured logic
- H. Levin; Electronic Design; October 29,1981; pp. 
153-157.
1.42) as per 1.5); Chapter 10.
1.43) Reliability Design Handbook - R. T. Anderson; IIT 
Research Institute (IITRI), Chicago, for Reliabil­
ity Analysis Centre; 1975; pp. 258-271.
1.44) as per 1.4); Chapter 8.
1.45) Testing logic networks and designing for testabil­
ity - T. W. Williams, K. P. Parker; Computer; 
October 1979; pp. 9-21.
1.46) Design for autonomous test - M. J. McCluskey, S. 
Bozorgui-Nesbat; 1980 Test Conf.*; pp. 15-21.
1.47) Microprocessor bus standard could cure designers' 
woes - G. Force; Electronics; July 20, 1978; pp. 
113-118.
1.48) A monolithic self-checking error detection process­
or - J. Chavade, M. Vergniault, P. Rousseau, Y. 




















Quality and control self-test - R. W. Spearman, F. 
D. Patch; 1980 Test Conf.*; pp. 257-260.
Implementation techniques for self-verification 
- R. M. Sedmak; 1980 Test Conf.*; pp. 267-278.
BIDC0, Built-in digital circuit observer - P. P. 
Fasang; 1980 Test Conf.*; pp. 261-266.
Enhance computer fault isolation with a history 
memory - G. L. Fitzgerald; Autotestcon '80*; pp. 
267-278.
Application of shift register and its effective 
implementation - M. Kawai, S. Funatsu, A. Yamada; 
1980 Test Conf.*; pp. 22-25.
Incomplete scan path with an automatic test gen­
eration methodology - E. Trischler; 1980 Test 
Conf.*; pp. 153-162.
Applying the Hamming code to microprocessor-based 
systems - E. L. Wall; Electronics; November 22,
1979; pp. 103-110.
A research oriented microcomputer with built in 
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
Self testing computers - J. B. Clary, R. A. Sacane;
Computer; October 1979; pp. 49-59.
as per 1.43); pp. 185-213.
Analysis of fault detection coverage of a self-test 
software program - V. Tasar; FTCS-8*; pp. 65-71.
Design self-testing capability for reliable uC-
system operation - S. Strom; EDN; October 28, 1981;
pp.102-108.
Efficient and effective uC testing requires care­
ful preplanning - E. S. Donn, M. D. Lippman; EDN; 
February 20, 1979; pp.97-107.
Design forethought promotes easier testing of 
microcomputer boards - M. D. Lippman, E. S. Donn; 
Electronics; January 18, 1979; pp.113-9.
Microcomputer for emulation bares hidden busses, 
functions - J. Moon; Electronics; July 17, 1980;
pp. 126-129.
Self-testing supercells; Alternative test strate­
gies - D. K. Bhavsar; Autotestcon '80*; pp. 135- 
139.
Off line, built-in test techniques for VLSI cir-
✓
-9-
cuits - M. G. Buchier, M. W. Sievers; Computer; 
June 1982; pp. 69-82.
1.66) Design of self-checking MOS-LSI circuits, applica­
tion to a four-bit microprocessor - Y. Crouzet, C. 
Landrault; FTCS-9*; pp. 189-192.
1.67) Memory finds and fixes errors to raise reliability 
of microcomputer - A. Heimlich, J. Korelitz; 
Electronics; January 3, 1980; pp. 168-172.
















BUILT IN TEST EQUIPMENT
IN-CIRCUIT EMULATION
CHIP SYSTEM






































FIGURE 1.1(a) TEST STRATEGIES



























Bed of nails Edge connector
Test clip Test clip
Guided probe























Error detection processor 
Cl.48,1.493




















Increased internal visibility 
Cl.633
Self testing supercells 
Cl.643




Redundancy £ comparison 
Cl.56,1.57,1.653









FIGURE 1.1(c) TEST STRATEGIES
CHAPTER TWO : EXISTING TECHNIQUES AND EQUIPMENT
2.1 : SIGNATURE ANALYSIS
Introduced in 1977, signature analysis (SA) compresses a 
data stream into a four digit signature by means of a 
sixteen bit shift register with linear feedback C2.13, as 
shown in Fig. 2.1. This signature represents the nodal 
activity during the defined period of measurement and is 
virtually unique. A defined and repeatable stimulus must 
be created at each node, this being derived from one of 
three sources, as indicated in Fig. 1.1.
Having located a faulty board, the signatures of the 
various nodes on that board are observed and compared with 
those obtained from a known good board, as shown in Fig. 
2.2. This allows a fault to be located to component 
level. The technique can detect all single bit failures 
and 99.998% of multi-bit failures C2.23. However, the 
signature contains no diagnostic information whatsoever, 
it is purely a go/no-go test.
Limitations of the technique are as follows:
1) Testing the CPU requires on-board provision to break 
the data bus and force an instruction which causes the 
address bus of the processor to cycle; for example, a 
no-operation.
2) All other feedback loops must have a means of being 
broken, in order to prevent an unstable signature 
caused by a fault propagating around that loop.
3) If the stimulus is derived from on-board software, 
then part of the system ROM will have to be allocated 
for this purpose and hardware added to provide the 
start, stop and clock signals required by the analyser 
C2.13.
4) SA cannot cope with asynchronous events, so operations 
such as direct memory access or interrupts must be 
synchronously driven, or tested independently.
-14-
5) Normally static nodes must be activated. This gener­
ally means more hardware, which could itself be a
source of failure.
6) SA cannot handle analogue signals.
7) SA cannot cope with timing or noise problems.
8) Considerable time might have to be spent in document­
ing signatures from a known good board, deriving 
troubleshooting trees and creating fault dictionaries.
SA can thus diagnose faults to component level, but it may 
require additional software and hardware to accommodate 
it. However, it is usually necessary to locate the fault 
to a board or a module first, so that signatures do not 
have to be taken from every node in the system. The 
technique can be considered as the digital equivalent of 
analogue signal tracing, whereby signatures, not wave­
forms, are compared with those on the circuit diagram,
see Fig. 2.2.
One of the main attributes of SA is that it can be used by 
a technician who is not familiar* with the system being 
tested, or, for that matter, familiar with computer fault 
finding techniques. The technician just needs to know 
what to run and where to probe, taking appropriate action 
when an incorrect signature is detected.
The signature analyser itself, in comparison with other 
equipment, is not expensive, but if a basic analyser is to 
be used, then the hardware and software required in a 
system to accommodate it will create additional cost over­
heads. This hardware and software has to be considered at 
the design stage of a system and prevents SA being applied 
to systems already in use (circuits cannot easily be 
added). However, this problem is overcome by using in- 
circuit emulation as the circuit stimulation. An instru­
ment is available which combines the two techniques C2.33.
Recently, Hewlett Packard has introduced a dedicated 
instrument to use in conjunction with a standard analyser,
-15-
which provides the stimulus, a set of pre-programmed tests
(one of these is a processor test) and provision for dedi­
cated tests C2.43. However, it is not disclosed how they
effect circuit partitioning and the breaking of feedback 
loops, without modification to the board being tested.
Literature tends to indicate a widespread use of SA C2.53, 
but of the twelve companies surveyed, only one was using 
the technique. So far, however, this company is not using 
it for production testing. It is still being evaluated in 
development circles. Results so far according to them are 
good. Good devices have been replaced with those having 
faults and the faulty devices successfully located using 
SA. Their design will take into account SA, by, for
example, allowing a 'watchdog' timer to be disabled during 
SA evaluation, a circuit which will have to be checked 
with conventional techniques.
The reaction from another company was that they would 
never use the technique, because 'the stimulus has no 
correlation with system operation, it is just a random 
stimulation'.
Generally though, the technique should be viable for pro­
ducts manufactured in hundreds off, which can more easily 
justify the time to document signatures and consideration 
during their design. However, as indicated later, compan­
ies producing large quantities still tend to be using 
automatic test equipment.
2.2 : LOGIC ANALYSIS
A logic analyser (LA) is probably the most powerful tool 
for all aspects of digital logic, particularly devel­
opment, debugging and fault diagnosis. However, it is not 
often used for production testing at present.
Logic analysers essentially fall into two categories:
-16-
1) Logic State : Used principally for the observation and
debugging of software, but also used to locate hard­
ware failures which cause software errors. Fig. 2.3 
shows how information is displayed on a typical 
logic state analyser.
2) Logic Timing : An extension of a conventional oscillo­
scope which can generally display up to sixteen 
waveforms of nodal activity on its screen, but where 
any point on a waveform can only occupy one of two 
levels, a logic '1' or a logic "O', as shown in Fig. 
2.4. This allows the relative action of waveforms 
to be observed and precise timing measurements to be 
made. It is therefore a tool for hardware evalua­
tion, debugging and testing.
It is now usual to find both types of machine combined 
into one instrument. Certain logic analysers are current­
ly becoming even more powerful, due to the inclusion of a 
digital oscilloscope within their frame C2.6D.
The power of the instrument is a result of major develop­
ments during recent years and lies with the features it 
can offer, which include:
1) General:
a) Microprocessor controlled.
b) Menu driven from a keyboard.
c) Clocking rates up to 500MHz.
d) Synchronous and asynchronous clocking simul­
taneously.
e) Can handle multiphase clocks and multiplexed
buses.
f) Voltage thresholds for TTL, MOS and ECL, or 
variable.
g) Comparison/reference memory for state and tim­
ing.
h) Operating parameters and memory can be stored
externally.
-17-
i) Can handle standard buses, eg. IEEE488 and 
RS232.
j) Has self and probe tests.
2) State Analyser (synchronous clocking):
a) External clock and control inputs so that the LA 
samples at the same time as the processor.
b) Minimum of 32 input channels.
c) Minimum of 64 sampled words.
d) Data formatted in binary, hex., octal, decimal 
or ascii.
e) Disassembly for a specific processor.
f) Search for a specific word.
3) Timing Analyser (asynchronous clocking):
a) Minimum of 8 input channels.
b) Minimum of 1000 sample points.
c) Traces displayed in pages (for more than 16).
d) Trace labels using ascii characters.
e) Trace order definable, with traces available on 
more than one page.
4) Signal Conditioning:
a) Sample - samples at a clock edge.
b) Glitch - detects threshold transitions between
its sample points.
c) Latch - allows the definition of complex tim­
ing arrangements between the clock and control 
signals.
d) Demultiplex - given appropriate control signals,
multiplexed buses are automatically demulti­
plexed.
5) Trigger Modes:
a) Input pattern recognition.
b) Post and pre-trigger.
c) Trigger qualifiers.
d) Nested or sequential trigger levels.
e) Arm and trigger.
✓
-18-
f) Delay by events or count.
g) Pass counts.
Despite all the features listed above, the use of logic 
analysis still seems to be reserved for design, develop­
ment, debugging and special test benches.
Of the twelve companies surveyed, most had some form of LA 
but only used it in the testing process for systems which 
defied fault analysis by any other method. Logic analysis 
for example, is the only method to resolve problems such 
as noise, timing and general device incompatibility.
The reluctance to use logic analysers for production test­
ing is attributed to:
1) The complexity of the instrument.
2) The infinite number of possible connections to the 
system under test.
3) Difficulty of incorporation into an automated test 
procedure.
In order to use its full capacity, a good working know­
ledge of the machine and its facilities is essential. If 
an error in a system is to be rapidly eliminated then the 
area of the fault must be known, along with the best 
connection of the analyser probes to locate it. A good 
knowledge of the system being tested is therefore also 
required. The operation of a LA is fairly straight­
forward to master, especially if all information is 
entered via a keyboard in sequenced menus which indicate 
when an input error has been made. In addition, help or 
operating manual pages within the machine greatly assist 
its use.
The level of operator interaction required with a logic 
analyser, which has already been reduced, will continue to 
decrease with:
-19-
1) Automated loading of set up parameters and reference 
memory from a host computer, or from an integral 
floppy disk unit.
2) Guided probe(s> analysis.
2.3 : AUTOMATIC TEST EQUIPMENT
Automatic test equipment (ATE) is the most common form of 
board and system testing. Two principal techniques are 
used with ATE; in-circuit testing and functional testing.
2.3.1 : In-Circuit Testing
In-circuit testing consists of a 'bed of nails', a set of 
spring loaded contacts pressed onto the wiring side of a 
printed circuit board by a vacuum, with (ideally) one nail 
for every circuit node on the board. There is then elect­
rical access to every node in the circuit.
Bare boards can be tested for open circuits and short 
circuits before assembly. Simple components such as 
resistors, capacitors, diodes, transistors and op-amps can 
be isolated with the use of guarding techniques C2.73 and 
then easily checked. Circuit models are required for more 
complex integrated circuits. These models would be truth 
tables for the basic logic gates. Howevever, for certain 
VLSI devices, such as microprocessors, circuit models are 
not easily obtained, even from the manufacturer.
Unless standardised (PCB designed on a matrix), a unique 
bed of nails is required for each board tested. These are 
expensive and, therefore, hard to justify for low volume 
production. However, in spite of all cost considerations, 
it is the ease of fault isolation, due to the internal 
visibility and controllability of the board via its bed 
of nails, which makes the technique so attractive.
A device is tested by forcing its inputs to a known state 
from the application of fast high current pulses which
-20-
override existing states, whilst simultaneously monitor­
ing its output(s), as shown in Fig. 2.5. Lines can also 
be overdriven to isolate or partition components. The 
rate at which test data can be applied is limited by the 
interface electronics used for the bed of nails and the 
quantity of data which has to be processed.
Thus an in-circuit test checks individual component per­
formance and board workmanship but cannot detect dynamic 
or interactive characteristics. It should also be able to 
handle both mixed logic and analogue electronics.
2.3.2 : Functional Testing
In a functional test, the board is exercised via its edge 
connector(s). Fig. 2.6 shows the structure of a typical 
tester. Stimuli for the board can be generated in a num­
ber of ways:
1) Software simulation : The components and connections
on the board are described'to the software, which 
will produce a gate level equivalent model (although 
the availability problem of certain device models 
arises again). The simulator then generates test 
programs, patterns and reference data.
2) Vectors : Fixed repeatable pseudorandom patterns are
used to generate input vectors. The resulting out­
put vectors are then compared with those from a 
fault free board.
3) Synthesis of address and data patterns : A programma­
ble tester is used for this purpose.
As the complexity of a board increases, it is difficult to
adequately test it solely from its edge connector, so
functional testing is usually assisted by guided probing.
This part of the technique uses a hand held probe to
increase the depth of penetration and observation on the 
✓
-21-
board. Probe sequences are generated by circuit simula­
tion.
The fixture cost for functional testing is low and a board 
with good visibility allows faults to be isolated accurate­
ly and quickly. Digital and analogue boards can be tested.
2.3.3 : In-Circuit and Functional Testing Combined
Combining the two techniques, as illustrated in Fig. 2.7, 
overcomes the shortcomings of each. Stimulation is pro­
vided as above, whilst the circuit response is checked by 
signature analysis, a stored correct response, in-circuit 
testing or probing.
The level of fault detection achieved by combining the 
techniques is better than either used independently. In 
addition, an effective test for timing has been achieved 
in one system C2.83.
Segmentation of the circuit and * component isolation is 
achieved via the bed of nails on which the automatic 
probing is carried out. The results can then be applied 
to a fault isolation tree and the suspected failure indi­
cated on a VDU/terminal.
A typical strategy for testing a microprocessor based 
system is:
1) Standard short circuit test.
2) Test discrete components.
3) In-circuit test of microprocessor.
4) Bus check for shorts.
5) Sequence address bus to confirm address decoding.







General drawbacks of ATE are:
1) Low throughput.
2) Poor test quality.
3) High cost.
2.3.4 : ATE In Use
Major computer manufacturers use ATE almost exclusively to 
test their boards at a central base. Boards having faults 
which are unresolvable using ATE are scrapped, simply 
because it is not cost effective to test further.
The only company surveyed which had volume production uses 
ATE extensively throughout their manufacturing operations. 
Devices such as microprocessors and associated program- 
able chips, are not evaluated using ATE, but after the 
rest of the board is tested, these are inserted and normal 
application software run to verify them.
One company, whose boards are well defined for document­
ing and testing, is in the process of ATE installation but 
another said that the volume of their product could not 
justify sophisticated ATE, since "it would not be cost 
effective'. Yet another was using in-circuit testing via 
edge connectors, EPROM and processor sockets, followed by 
a system test.
In other quarters, extensive use is made of analogue and 
digital models, with the philosophy that if all the 
interconnections and components are faultless, then the 
board must work - the design says so. PCBs there are 
designed around a 0.1" matrix, using computer aided design 
to allocate components, test points, isolation paths and 
so on. The computer also produces the PCB layout and jig 




2) Component checks using static models.
3) Components checks using dynamic models.
4) Components checked in clusters.
5) Functional test of system as in normal operation (no
testing as a general computer).
2.4 : SELF TEST
Probably the most important form of testing is self test; 
software written to interrogate the various modules of a 
system and to report on failures found. Alternatively, it 
can be used as a go/no-go test with zero diagnostic infor­
mation. It is used in just about every stage of every 
product, from design and development, through to produc­
tion testing and field testing.
The software can be written to test the system as a 
generalised computer, or written to test only those 
operations used in the application program. A typical
strategy for the former is given*in Fig. 2.8. It need 
not, however, be restricted to testing but can also be 
used for performance verification and auto-calibration.
Self test programs can be run:
1) On power up to provide an automatic confidence check.
2) On line: a) as a foreground task.
b) as a background task.
3) Off line.
The tests can provide rapid go/no-go diagnostic checks for 
each module in a system, or exhaustively verify the 
complete system; for example during test by the manufac­
turer where comprehensiveness is more important than 
testing time.
The software itself can be:
-24-
1) Resident in ROM.
2) Down loaded from a host computer using:
a) DMA.
b) RS232.
3) Loaded from disk.
4) Loaded from cassette or paper tape.
Advantages of self test include:
1) Tests are easily generated.
2) Modules can be quickly tested, one after another.
3) No test probes to move around.
The fundamental limitation of the technique, however, is 
that a certain amount of the system, CPU, ROM (possibly 
RAM as well) and associated buses, must be fully function­
al before the program will run at all. Once this has been 
established, then self test is a powerful tool. It means
that devices can be readily exercised in conjunction with
others at their normal operating speeds. Given a faulty 
component or module, then specific programs can be written 
to probe further, obtaining more information about the 
fault.
The fundamental limitation of self test, detailed above, 
can be overcome by running the programs from an external 
source via buffers for isolation. This falls into the 
domain of in-circuit emulation, which is considered next.
If the software is stored in on-board ROM, then this 
creates an additional system overhead but a minimum of 
external connections are required.
Another area of concern with self test programs is their
fault coverage; i.e. the type and quantity of faults they
are able to detect. Ideally, all faults should be detect­
ed. This problem is analysed by Tasar C2.93, who con­
cludes that the level of fault coverage is dependent on 
how well written the programs are to start with.
The major computer manufacturers have test software resi­
dent in ROM, on dedicated diagnostic boards, loaded from 
disk, or transmitted along a telephone line from a remote 
diagnostics centre. Some of this software is extremely 
powerful, locating faults to component level.
One company has a system consisting of a master controller 
connected to subcontrollers via an RS232 link, where each 
controller has its own microprocessor. Test software is 
resident in each module, but additional software can be 
downloaded from the master to the subunits via the RS232 
link when it is required.
Another form of maintenance strategy used by several 
companies is the line replaceable unit (LRU). This is a 
unit which runs tests to check itself and other modules in 
background and/or foreground modes, indicating any fault 
condition(s) found on lights or a dedicated message dis­
play. When a fault occurs the LRU can be easily replaced. 
All the individual boards of the faulty LRU will be 
tested, sometimes on site but more generally at a central 
repair base, proceeding to a system test if the fault is 
not found and finally being given to the boffins to sort 
out in sheer desperation.
Standard checks of CPU, ROM, RAM and I/O may be carried 
out in a background mode, whilst checks on current levels 
and other vital parameters are carried out in the fore­
ground. Some companies prefer to perform a series of 
'actual' tests, checking the system as it actually oper­
ates and not as a general computer.
2.5 : IN-CIRCUIT EMULATION
Microprocessor emulation is the most common form of in-
circuit emulation (ICE) in microprocessor based systems,
since the processor is the focus for all system functions.
The operation of the system processor is emulated by a 
✓
-26-
similar processor in the ICE equipment, which is connected 
via a cable to the socket of the system processor. Fig. 
2.9 shows a typical in-circuit emulator and its connec­
tion to the system under test (SUT). No modifications 
are required to the SUT. The processor in the emulator is 
totally controllable which, together with the resources of 
the instrument, allows hardware and software to be easily 
evaluated and debugged. It is therefore a powerful devel­
opment tool.
Operationally, ICE is similar to ATE in that test patterns 
or vectors are generated and the response of the circuit 
to them is observed.
Generally only the clock of the SUT must be functional to 
carry out ICE, so diagnostic software in the emulator can 
exercise all components in the SUT, isolating faults to 
module or subsystem level.
Typical features of ICE are:
1) Real time transparent emulation of processor.
2) Control over CPU registers, I/O ports and memory.
3) Breakpoint s.
4) Menu driven.
5) Subsequent run comparison (reference memory).
6) Trace memory.
7) Write protection.
8 ) Frequency and pulse measurement.
Emulators typically have three operating modes:
1) Interrogation:
a) Display and modify CPU registers, including pro­
gram counter and stack pointer.
b) Display and modify memory and I/O ports.
c) Set breakpoints for a number of different condi­
tions .
d) Display trace details from high speed memory.
*
-27-
e) Display measured execution time.
2) Run/Emulation:
a) Execute test programs, testing and comparing in­
formation.
b) Display register or memory contents, along with 
address or data bus states on a sampled basis.
3) Single Step:
as for 2), but one cycle at a time.
Emulation must be able to cope with all current and 






f ) Asynchronous communication.
g> Multiprocessors.
h) Cache and queue memory.
i) Instruction prefetch.
In addition, with systems supporting in excess of one 
mega-byte of memory and, therefore, making extensive use 
of memory managers, system emulation must now be consider­
ed, whereby the processor and the memory manager are 
emulated together to distinguish between logical and phys­
ical addresses. System emulation must also cope with 
shared memory C1J33.
Advantages of ICE are:
1) Software is written in the assembly language of the 
target processor (or even a higher level language).
2) The system is tested as used (CPU instructions), in­
stead of a meaningless stream of bits.
3) A working kernel is not required (the minimum amount
-28-
of CPU and ROM necessary to run a program).
4) A system overhead is not created for testing (no 
additional system ROM).
5) Minimal test fixtures are required (no bed of nails or 
edge connector).
6 ) Hardware and software can be tested and debugged under 
tightly controlled conditions (eg. start,step,stop).
7) Tests can be executed in real time at normal or 
abnormal clock rates.
Limitations of ICE are:
1) A socketed CPU is required, which could be a source of 
unreliability. This is especially a problem in mili­
tary applications.
2) Problems are generally located to module not component 
level due to a lack of visibility.
As far as the use of ICE in industry is concerned, the 
company survey revealed that, like logic analysis, it 
seems to be reserved for development use only.
Extensive ICE has been practically implemented on a 
specific microprocessor based system, using a Millenium 
Microsystem Analyser (in-circuit emulator). This work is 
detailed in Appendix A.
2.6 : COMBINED TECHNIQUES
All the techniques considered so far have limitations as 
indicated. The tendency now is to combine techniques in 
one instrument, so overcoming at least some of the in­
herent problems of each. The combination of logic state 
and timing analysis in one instrument is already standard, 
with the combination of in-circuit and functional ATE 
becoming so C2.103. Other techniques available together 
are:
1) ATE with self test C2.11D.
-29-
2) ICE with SA 12.3,2.123.
3) ICE with logic state analysis C2.133.
4) ICE with complete logic analysis C2.143.
5) Logic analysis with pattern generation (ATE) C2.15, 
2.163.
6) Logic state analysis with SA C2.173.
In a previous paper C2.183, it was suggested that ICE and 
logic analysis would be the most effective combination. 
Individually, these two techniques are not widely used for 
production testing but they are nevertheless powerful 
diagnostic tools. This is still believed to be the case 
for the techniques considered so far, with proposed 
operation as follows. Tests and stimuli generated by the 
emulator would be monitored by the LA using guided probes. 
State and timing information could then be recorded 
automatically and compared with a set of expected res­
ponses. Resulting error information should be displayed 
in state, timing and English formats. The LA would also 
be available to deal with time related failures and 
glitches. Instruments are currently available which com­
bine these two techniques, but they do not function exact­
ly as outlined above.
Self test, as has already been indicated, appears at some 
point in the life of every processor based system, which­
ever overall single or combination of techniques is used 
to test it. It will also continue to do so, since in 
terms of generation, implementation and cost it creates 
minimal overheads.
Test instruments developed to cope with increased circuit 
complexity are certainly becoming more sophisticated. One 
such instrument C2.193 runs a 'learn' algorithm to in­
terrogate the system and generates a memory map of all 
devices found. It then proceeds, based on this informa­
tion, to run a series of pre-programmed tests. Also part 
of this equipment is a hand held probe, which can observe
nodal activity or provide stimulus in synchronism with
✓
-30-
operations in the system under test.
2'7 : PROVISION FOR TEST
Design for testability has become a major requirement for 
all microprocesor based systems, especially those diagnos­
tic techniques which require built in provision to use 
them. This built in provision for test may vary from 
special test points or memory for test software, to 
removable wire links or additional gates to allow external 
control of devices. In this chapter it has been built in 
test (BIT) at board level for subsequent external testing. 
More significantly now and certainly in the future, is BIT 
at board level for internal (self) testing and BIT at chip 
level for self checking chips. These techniques (see Fig.
1.1) are considered in the next Chapter.
-31-
2.8 : REFERENCES
2.1) Fault diagnostics for microprocessor based systems 
- T. J. Hollis; Project report for the degree of 
B.Sc. in Electrical Engineering; University of 
Bath, England; 1980; pp. 13-15.
2.2) Instruments - A. Santoni; EDN; July 20,1979; p. 97.
2.3) Combining diagnosis and emulation yields fast fault
finding - L. Badagiliacca, R. Catterton; Electron­
ics; November 10, 1977; pp. 107-110.
2.4) Applying signature analysis to existing processor- 
based products - R. Rhodes Burke; Electronics; 
February 24, 1981; pp. 127-133.
2.5) Signature analysis wins new acclaim - M. Marshall;
Electronics; February 14, 1980; pp.103-4.
2.6) Logic analyzer model 1631/AD...the one with the
scope; Data sheet 5954-2614; Hewlett Packard; USA; 
1985.
2.7) 3062A board test system; Data sheet 5953-6934; 
Hewlett Packard; USA; 1983.
2.8) Test LSI boards functionally on an in-circuit 
tester - T. Jackson, P. Vais; Electronic Design; 
October 29, 1981; pp. 137-144.
2.9) Analysis of fault detection coverage of a self-test 
software program - V. Tasar; FTCS-8*; pp. 65-71.
2.10) ATE swings toward merged in-circuit, functional 
tests - J. McLeod; Electronic Design; October 29, 
1981; pp.115-9.
2.11) Quality and control self-test - R. W. Spearman, F. 
D. Patch; 1980 Test Conf.*; pp.279-286.
2.12) Plan a mixed strategy to simplify uP systems tests 
- B. Hordos; EDN; April 20, 1979; pp. 183-187.
2.13) Emulator solves system-level debug problems - R. 
Parks, K. Greenberg; Electronic Design; May 14, 
1981; pp.173-180.
2.14) Analysers, emulators: Together they stand for 
better testing - R. Allan; Electronic Design; June 
25, 1981; pp 75-92.
2.15) Flexible pattern generation aids logic analysis
S. Palmquist, M. Pettet; Electronic Design; 
October 15, 1981; pp. 165-174.
2.16) Logic analyser delivers test patterns too - S. 
Palmquist, G. Hoeren; Electronics; September 8,
-32-
1981; pp.113-9.
2.17) Logic state and signature analysis combine for easy 
testing - I. H. Spector; Electronics; Vol. 51, No. 
12; June 8, 1978; pp. 141-5.
2.18) The maintainability of processor systems - T. J.
Hollis; Microtest '81; 21-24 September 1981;
University of Bath, England; Society of Electronic 
and Radio Technicians; pp. 110-123.
2.19) Portable tester learns boards to simplify service 
- D. Cassas; Electronics; June 16, 1981; pp. 153- 
160.
































8 BIT DATA BUS
FIGURE 2.2 CIRCUIT DIAGRAM WITH SIGNATURES















ADDRESS STATUS/ DATA 
CONTROL
TYPICAL DISPLAY ON A LOGIC STATE 
ANALYSER
-34-
FIGURE 2.4 TYPICAL DISPLAY ON A LOGIC TIMING
ANALYSER
PATTERN
MEMORY W ORT CCKTRCLUD
DRIVE



































(IS UEASURMXT paxrs rtx CARD)WAVC
TEST ADAPTER
DTU - DIGITAL TEST UNIT 






















TEST LOGIC USING 
DIGITAL OVERDRIVE
BED OF NAILS 
(IN-CIRCUIT TESTING)




EDGE CONNECTOR WITH 
LIMITED BED OF NAILS
BED OF NAILS AND EDGE CONNECTOR 
- FULL FUNCTIONAL CAPABILITY









NO t DISPLAY ROMKUM UKf ERROR









CHECK CRT INTERFACE 
& DISPLAY RAM









(  SYSTEM OK )
NO
NO B DISPLAY KEYBOARD
ERROR







FIGURE 2.8 A COMPLETE SELF TEST PROGRAM
-37-





































MODULE k / \/ PROBE


























FIGURE 2.9 STRUCTURE OF A TYPICAL IN-CIRCUIT EMULATOR AND ITS CONNECTIONS
CHAPTER THREE : BUILT IN TEST - AN INTRODUCTION
Microprocessors are typical of VLSI circuits, so a lot of 
attention has been given to their testing C3.1-3.153. 
Devices in a microprocessor based system are difficult to 
test because:
1) The number of possible faults is extremely large. 
There are thousands of gates and interconnecting 
lines, all subject to failure, and thus a successful 
test will require a large number of test patterns.
2) Access to internal components and lines is severely 
limited by the number of I/O connections; say 5000 
gates but only 40 I/O pins.
3) New and complex failures such as pattern sensitivity 
occur.
4) A complete description of the device is generally not 
available. Microprocessor specifications, for exam­
ple, typically consist of a register level block 
diagram, a listing of the instruction set of the 
processor and some information on system timing.
These points highlight the problem areas that built in 
test (BIT) must overcome. This chapter introduces four 
concepts to describe and specify BIT techniques. These 
are:
1) Faults and Fault Models - the types of fault to be
detected, and their effect on circuit operation.
2) Observability and Controllability - the level of
access to internal components.
3) Design for Testability - the mechanisms to implement
during design.
4) Redundancy - the quantity of additional hardware and
software for test purposes.
3.1 : FAULTS AND FAULT MODELS
Faults fall into two categories; permanent (solid) and
-39-
intermittent (transient). Intermittent failures are the 
most difficult to test for and can be caused by timing and 
noise problems, or clock waveform and power supply degra­
dation. A test sequence which will detect solid faults in 
one pass may have to be repeated many times to stand any 
chance of detecting intermittent failures.
A combinational circuit may be tested by applying all 
possible input combinations (test patterns) to verify its
truth table. If the circuit has twenty inputs, for exam-
20 6 pie, then this means 2 = approx. 10 patterns. However,
all single and multiple faults will be detected (assuming
they are detectable).
Sequential circuts require a minimum of s2n test patterns, 
where s is the number of stable states, which must be 
arranged as a specific checking sequence, a difficult and 
time consuming task.
The quantity of test patterns and therefore test time for 
circuits with a significant number of inputs is large. To 
reduce both, it is necessary to test for a subset of all 
possible faults. A list of likely failures is construct­
ed, fault models created to represent the logical charac­
teristics of these faults, input patterns generated to 
test for them (test pattern generation) C3.16-3.183 and 
test verification implemented using fault simulation 
C3.163 to determine the faults actually detected (fault 
coverage).
Faults and fault models as applied to VLSI are now 
considered.
3.1.1 : Stuck-At Fault Model
The stuck-at fault model assumes that a logic gate has an 
input or output permanently stuck at either a logic 0 or a 
logic 1. Fig. 3.1a shows a fault free AND gate, whilst 
Fig. 3.1b shows the same gate with a stuck-at-1 (SA1)
fault. The faulty gate sees input A as a 1 irrespective 
of the actual level applied to it.
A circuit might not physically have a stuck-at failure (a 
node literally connected to OV or the positive power 
rail), but exhibits the characteristics from an operation­
al point of view. Consider the circuit in Fig. 3.2, a 
typical NAND gate. If the base-emitter junction of input 
A (b-e^) is open circuit, then input A appears to be SA1, 
whereas a collector-emitter short on TR4 looks like output 
Y stuck-at-0 (SAO).
If a network has a total of N gate inputs and outputs,
then since each of these may be SAO, SA1, or correct, then
Nthe total number of fault combinations is 3 -1. So for a
circuit with thirty gates, each having two inputs, there
14would be approximately 2 x 10 faulty states. This is 
far too many faults to assume, as the time required to 
generate and implement tests for them would be imprac­
tical.
In order to overcome this problem, the model of a single 
stuck-at fault has been assumed for many years, in which 
a faulty circuit is considered to have one single stuck-
at failure only. Thirty 2-input gates will then have one
hundred and eighty possible single stuck-at faults.
The quantity of faults can be further reduced by fault 
equivalencing C3.163. Consider the AND gate in Fig.
3.3 with one of its inputs SAO. The effect of this fault 
is equivalent to the output Y SAO, so for test purposes 
only one of the two faults needs to be considered.
The single stuck-at fault model might be suitable for SSI 
and MSI, but LSI and VLSI have other faults (excluding 
manufacturing defects and design faults C3.19-3.213) which 
cannot necessarily be modelled as stuck-at failures. 
These are detailed next.
-41-
3.1.2 : LSI/VLSI Failures
1) Pattern Sensitive Failures : These are failures ob­
served only when a specific set of states (logic 
levels) or sequence of states occurs. These are 
particularly prevalent in memory. Example failures 
are the non recognition of a single 0 (1) following 
a string of l's (0's) due to unwanted hysteresis 
effects, or the generation of noise (crosstalk).
2) Transformation Failures : These are failures which
create feedback, so that sequential behaviour or 
oscillation is exhibited from a combinational cir­
cuit. An example of this is the input short 
illustrated in Fig. 3.4. CMOS is particularly 
susceptible to these failures.
3) Environmental Failures : These are failures caused by
heat, vibration and humidity, as well as electrical 
and electromagnetic interference.
4) Parametric Failures : These are failures such as
resistance or transistor gain variation due to 
component aging.
A lot of these faults will be intermittent and difficult, 
if not impossible, to model as single stuck-at faults. In 
some cases an electrical model might be more appropriate 
than a logical model. Galiay C3.223 comments on this and 
details failures related to integrated circuit (IC) con­
struction and layout. He then suggests IC layout rules to 
assist test, which decrease failure types and avoid those 
which are difficult to detect.
The inadequacy of the stuck-at fault model for VLSI is 
considered by Nickel C3.233, who presents nearest neigh­
bour and neighbourhood interaction fault models, princi­
pally for memory.
-42-
Two other important classes of faults are bridging and 
unidirectional faults.
3.1.3 : Bridging Faults (shorts)
In general, the stuck-at fault model does not cover 
bridging faults. If two active lines are shorted togeth­
er, one at logic 0 and the other at logic 1, the resulting 
level could be a 0, a 1 or indeterminate. The individual 
source impedances principally determine which level will 
win, but in practice either an ORing or an ANDing 
operation will take place. For the OR situation, a short 
circuit, as indicated in Fig. 3.5a, can be modelled as an 
additional gate, as shown in Fig. 3.5b.
Bridging faults in particular are a problem in program­
mable logic arrays (PLAs) C3.243.
3.1.4 : Unidirectional Faults
Unidirectional multiple errors are those where all erron­
eous bits are the same; either l's changed to 0's or 0's 
changed to l's. For example, a unidirectional error can 
change 0110 to 1111 or 0000, but not 1001. The failure of 
a power lead to or in an IC, such that a number of its 
data lines become stuck at the same logic value, creates 
a unidirectional error. Alternatively, a single fault in 
the address decoding circuitry of a ROM chip could cause 
the correct word and an incorrect word to be simultaneous­
ly selected. Externally, the words would generally appear 
to be ORed together, resulting in a word with more l's 
than the correct word. This fault causes data dependent 
unidirectional errors, whereas the power lead failure 
produces a permanent undirectional error. This type of 
fault can also occur in the transmission of serial data 
from a device having a single failure.
3.1.5 : Functional Fault Model
-43-
Since the information required to generate logical or 
electrical models is generally not available for VLSI, 
then a technique proposed by Thatte C3.2,3.11,3.143 makes 
use of the information which is. For microprocessors, 
this means using the register level block diagram and 
instruction information to produce a functional fault 
model. Rather than testing each instruction with various 
operands and exercising the internal registers, various 
functional modules are defined from the architectural 
organisation and instruction information, which are inde­
pendent of their actual implementation. Each functional 
unit is allowed to have one functional fault, such as the 
wrong instruction or no instruction executed (although 
these may be caused by stuck-at faults) and test patterns 
generated to locate these faults.
Marchal C3.253 extends these principles using the single 
or dual internal buses of a microprocessor.
3.2 : OBSERVABILITY AND CONTROLLABILITY
The principal aim of any BIT technique is to increase the 
observability and controllability of a network.
If system inputs are fed to an AND gate, as shown in Fig. 
3.6a, then these inputs are directly controllable. Simi­
larly, if the output from the AND gate is a system output, 
then it is directly observable. If this AND gate becomes 
part of a large combinational circuit, as indicated in
Fig. 3.6b, then it is more difficult to control and
observe this one gate. Going one step further, if the AND
gate is embedded into a much larger sequential circuit
(VLSI), then the problem gets much worse.
The testability of a circuit is therefore directly related 
to the difficulty of controlling and observing internal 
nodes from system inputs and outputs respectively. Ideal­
ly, every node should be controllable and observable.
-44-
A method of analysing the controllabilty and observabili­
ty for combinational and sequential networks has been 
proposed by Goldstein C3.26D.
3.3 : DESIGN FOR TESTABILITY
Built in test (BIT) requires design for testability, a 
provision for test at the design stage. Design for 
testability is broadly divided in to two categories; the 
so called 'ad hoc' approach or the structured approach. 
This chapter is concerned with the structured approaches, 
but since these frequently extend the concepts of the ad 
hoc methods, it is worth listing some of these ad hoc 




i) Separate boards for different areas/functions 
of the system; for example, CPU, memory and 
I/O cards.
ii) Removable wire links, either on the board or
external to the board. These could be feed­
back paths which can then be broken.
b) Logical : Additional logic and control lines to
allow direct control of internal modules, see Fig. 
3.7, external control of the clock, see Fig. 3.8, 
or direct observation and control of module inter­
connections, see Fig. 3.9 C3.16,3.27-3.293. The
latter technique is particularly suited for exter­
nal control of microprocessor buses, so that
perhaps switches can set up the desired informa­
tion C3.303.
2) Test Points C3.16,3.29,3.31,3.323:
a) Additional observation points.
b) Additional observation/control points (as re­
quired for logical partitioning).




a) Single LEDs to indicate the state of power 
rails, serial data lines and general gate/chip 
inputs or outputs C3.293.
b) Latches and seven segment displays for bus in­
formation C3.30,3.333.
4) Microprocessor Single Step : Additional circuitry
which allows a microprocessor to execute a single
instruction or clock cycle C3.28,3.333.
5) Signature Analysis (see section 2.1).
6) Self Test Programs (see section 2.4).
7) Watch Dog Timer C3.343: Checks the response of the
microprocessor to interrupts, or checks that soft­
ware or hardware processing time is within a 
predetermined limit.
8) Control Line Access : Allows normally static control
lines to be externally controlled by using tie-up or 
tie-down resistors C3.313.
9) Standard Bus Structures : Well defined bus structures
such as the Microbus C3.353, the G64 Bus C3.363, or 
the VME Bus C3.373 allow the control and observation 
of bus activity to be standardised for test purposes.
3.4 : REDUNDANCY
Most of the ad hoc and structured approaches use some form 
of redundancy to achieve improved testability; i.e. not 
necessary for normal system operation. Redundancy can be 
classified as follows:




2) Software : Additional routines such as self test
programs.
3) Time : If system checking is performed during normal
operation, then hardware can do this with no extra 
time burden. However, software checking will impose 
a time overhead (redundancy).
4) Information : Both hardware and software require in­
formation redundancy. They need and/or generate more 
information than is necessary for normal operation.
-47-
3.5 : REFERENCES
3.1) Microcomputer for emulation bares hidden buses, 
functions - J. Moon; Electronics; July 17, 1980; 
pp. 126-129.
3.2) Test generation for microprocessors - S. M. Thatte, 
J. A. Abraham; IEEE Trans. Comput.; Vol. C-29, No. 
6; June, 1980; pp. 429-441.
3.3) Testability considerations in microprocessor-based 
design - J. P. Hayes; Computer; March, 1980; pp. 
17-25.
3.4) Survey of approaches to testing and diagnosing 
microprocessor-based systems - W. Schmitt, E. 
Lynch; Autotestcon '80*; pp. 127-130.
3.5) The microprocesor testing revolution - E. S. Donn;
1979 Test Conf.*; pp. 121-124.
3.6) Design of easily testable bit-sliced systems - T. 
Sridhar, J. P. Hayes; IEEE Trans. Comput.; Vol. C- 
30, No. 11; November, 1981; pp. 842-854.
3.7) A testable design of general purpose microprocess­
ors - R. Parthasarathy; FTCS-12*; pp. 117-123.
3.8) Testing the new generation of microprocessors - P. 
Wohlfarth, D. Smith; 1979, Test Conf.*; pp. 255-257.
3.9) Testing internally clocked LSI - R. Winn; 1979 Test 
Conf.*; pp. 338-340.
3.10) Bit-slices microprocessors testing - a case study 
- M. El-Lithy, R. Husson; FTCS-10; pp.126-1.
3.11) User testing of microprocessors - S. M. Thatte, J. 
A. Abraham; IEEE Compcon '79; Spring 1979; San 
Francisco, California, USA;IEEE Pub. No. 79 CH1393- 
8C; pp. 108-114.
3.12) Testing the 8086 - M-G Lin, A. K. Yuen, K. Rose;
1980 Test Conf.*; pp. 426-431.
3.13) A functional test method for microprocessors - L. 
Shen, S. Y. H. Su; FTCS-14; pp 212-218.
3.14) Fault coverage of test programs for a microprocess­
or - J. A. Abraham, S. M. Thatte; 1979 Test Conf.*;
pp. 18-22.
3.15) Testing using a minimal number of instructions - P. 
K. Lala; Microprocessors and Microsystems; Vol. 5, 
No. 7; September 1981; pp. 295-298.
3.16) Design for testability - T. W. Williams, K. P. 
Parker; Proc. IEEE; Vol. 71, No. 1; January, 1983;
-48-
pp. 98-112.
3.17) Test generation techniques - S. B. Akers; Computer; 
March, 1980; pp. 9-15.
3.18) LSI logic testing - an overview - E. I. Muehldorf, 
A. D. Savkar; IEEE Trans. Comput.; Vol. C-30, No. 
1; January, 1981; pp. 1-16.
3.19) Microprocessor device reliability - E. R. Hnatex; 
1977 IEEE Southeastcon Region 3 Conference; pp. 
82-86.
3.20) Improvements in reliability of VLSI integrated 
circuits - M. Noble; IEE Electronics and Power; 
March, 1985; pp. 222-226.
3.21) Hardware logic design faults - a classification and 
some measurements - T. L. Faulkner, C. W. Bartlett, 
M. Small; FTCS-12*; pp. 377-380.
3.22) Physical versus logical fault models MOS LSI 
circuits; Impact on their testability - J. Galiay, 
Y. Crouzet, M. Vergniault; IEEE Trans. Comput.; 
Vol. C-29, No. 6; June, 1980; pp. 527-531.
3.23) VLSI - The inadequacy of the stuck at fault model 
- V. V. Nickel; 1980 Test Conf.*; pp. 378-381.
3.24) The effect of untestable faults in PLAs and a 
design for testability - D. K. Pradhan, K. Son, 
1980 Test Conf.*; pp. 359-367.
3.25) Updating functional fault models for microcomputer 
internal buses - P. Marchal; FTCS-15; pp. 58-64.
3.26) Controllabilty/observability analysis of digital 
circuits - L. W. Goldstein; IEEE Trans. on 
Circuits and Systems; Vol. CAS-26, No. 9; September 
1979; pp. 685-693.
3.27) Design for testability for microprocessor-based 
boards - R. Willis; Electronic Engineering; April, 
1983; pp. 67-72.
3.28) Designing boards for testability - G. Foley; New 
Electonics; Vol. 12, No. 23; November 27, 1979; 
pp. 98-102.
3.29) Practical troubleshooting techniques for micropro­
cessor systems - J. W. Coffron; Prentice-Hall Inc, 
Englewood Cliffs, New Jersey, USA; 1981; Chapter 8.
3.30) Save both time and money debugging a uP prototype 
board -M. M. Zyla; EDN; 5 September, 1978; pp. 
153-156.
3.31) Design for testability - J. Turino; Logical Solu-
✓
-49-
tions Inc.; Campbell, Ca., USA; 1979; Chapter 4.
3.32) Enhancing testability of large-scale integrated
circuits via test points and additional logic - M.
J. Y. Williams, J. B. Angell; IEEE Trans. Comput.; 
Vol. C-22, No. 1; January, 1973; pp. 46-60.
3.33) Testing microprocessor-based systems at home - C.
Carson; Wireless World; April, 1984; pp. 44,45,49.
3.34) Designing fail-safe microprocessor systems - D. R. 
Ballard; Electronics; 4 January 1979; pp. 139-143.
3.35) Microprocessor bus standard could cure designer's
woes - G. Force; Electronics; 20 July, 1978; pp.
113-118.
3.36) The G-64 route to microsystem hardware implementa­
tion - P. Wilson, C. Cordingley; New Electronics; 
12 November, 1985; pp. 43-46.
3.37) Decentralising uP bus grows easily from 16 to 32
bits - C. Kaplinsky; Electronic Design; 12 November
1981; pp. 173-179.

















FIGURE 3.2 TWO FAILURES IN A TTL NAND GATE
SAO
*
X  permanently 0
SAO
FIGURE 3.3 EQUIVALENT STUCK-AT FAILURES






























FIGURE 3.6 THE CONTROLLABILITY AND OBSERVABILITY OF A SINGLE GATE
1= Enable 
0= Inhibit




















FIGURE 3.9 OBSERVATION & CONTROL OF MODULE INTERCONNECTIONS
-53-
CHAPTER FOUR : TECHNIQUES FOR BUILT IN TEST
Built in test (BIT) allows testing to be implemented in 
one of two modes, on or off line. Ideally it has 
negligible effect on system performance and involves 
minimal space and cost overheads. At the same time, it 
seeks to transform the fault diagnosis of a complex 
circuit from almost impossible to a relatively straight 
forward operation. It also needs to minimise test 
generation, test procedures and the assistance of external 
equipment.
BIT ranges from well established (and widely used) coding 
techniques to the more recent and powerful concept of a 
Serial Shadow Register. Both of these are described in 
some detail in this Chapter, along with three more 
important forms of BIT; Scan Design, the Built In Logic 
Block Observer and Autonomous testing. Many other tech­
niques for BIT have been proposed, so a further selection 
are concisely detailed.
It is interesting to see what impact these techniques have 
had on chip manufacturers. A number of these techniques 
are being implemented at chip level and a summary is 
provided of some of the available devices.
However, prior to any of this discussion, section 4.1.2 
presents the Linear Feedback Shift Register (LFSR), a 
circuit which is extensively used for data generation and 
compression.
The chapter concludes with a generalised approach to BIT, 
the concept of self verification.
Many of the BIT techniques presented can either individ­
ually, or in combination make extensive use of self 
checking circuits at chip, board or system level, as they 
do in the error detection processor C4.1,4.23. These 
aspects are not covered here, but dealt with in subsequent
Chapters on self checking circuits and systems.
4.1 : MODES OF TESTING
Testing or checking can he implemented concurrently with 
normal operation (on line), or distinct from normal 
operation as an off line process, using the circuitry as 
it is, or in a special test mode of operation C4.163.
Fig. 4.1 shows the on and off line techniques for BIT 
considered in subsequent sections, together with their 
interactions. It also includes, for completeness, certain 
referenced aspects of on line testing for fault tolerance.
4.1.1 : On Line Testing
The need for on line testing originates from the require­
ments of fault tolerant computing, where ideally there is 
no computer down time due to failure. This is certainly 
true for critical space and aviation applications where 
faults must be detected, located*and either the erroneous 
data corrected, or the faulty module replaced with built 
in spares.
Concurrent checking therefore requires, in general, a 
significant hardware redundancy to achieve its goal with 
zero time overhead. This method of testing also has the 
ability to detect intermittent errors that occur during 
normal operation, which an off line test procedure may not 
detect.
Coding schemes or duplication and comparison are the most 
common forms of concurrent testing.
4.1.2 : Off Line Testing
Off line testing uses hardware and/or software to verify
that the operation of devices assembled as part of a
system or in isolation is correct. The only restriction 
✓
-55-
on testing time is that it falls within an acceptable 
limit.
Non concurrent testing allows all modules to be thorough­
ly exercised, a process which can locate failures not 
discovered during concurrent testing methods because nor­
mal operation does not create such an extensive stimulus.
Increased observability and controllability (see section
3.2) is often created by the test mode of operation 
whereby certain circuitry, particularly sequential logic, 
is reconfigured. Scan design is one such technique.
4.2 : THE LINEAR FEEDBACK SHIFT REGISTER
Mechanisms for generating a circuit stimulus and observing 
its response are essential to the success of any BIT 
technique. These are provided by effective and compact 
circuits for test pattern generation and data compression 
respectively. The most widely used device to provide both 
is the linear feedback shift register (LFSR), particularly 
in scan path design.
A series of D-type flip-flops configured with feedback 
paths, as shown in Fig. 4.2a, is known as a LFSR and can 
be used to provide a pseudorandom pattern generator. When 
the feedback paths are carefully selected, it can gen­
erate a long, near random, sequence of output patterns 
Nwith length 2 , where N is the number of flip-flops in
the shift register. Fig. 4.2b gives these patterns for 
the depicted LFSR. The philosophy of pseudorandom pat­
terns is that by applying a sufficiently large number of 
them to the circuit, acceptable testing will be achieved. 
The LFSR allows test patterns to be applied at high speed, 
so that several million can be achieved within an 
acceptable test time.
A relatively short test sequence can produce a considera­
ble amount of response data, since the state of each
output must be checked after the application of each 
stimulus. Data compression is therefore essential and 
can be achieved with the use of a LFSR plus an external 
input. This technique has already been covered in section 
2.1, as it is signature analysis. Fig. 4.3 shows a four 
stage LFSR for this purpose and the resultant signature 
from a sample data stream. However, since most circuits 
have more than one output, and to avoid having a separate 
LFSR for each of these, the basic LFSR can be expanded to 
accommodate more than one input, as shown in Fig. 4.4. 
This is then known as a multiple input signature register 
(MISR) and is obviously more economical than the basic 
LFSR, but can suffer from a worse error detection 
probability [4.313.
4.3 : ERROR DETECTING CODES
The most common form of BIT is coding. It provides an on 
line (concurrent) method of verifying circuit operation, 
using both information and hardware redundancy.
A set of lines, typically a bus, are encoded, ideally at 
their source, and the code subsequently checked at various 
points in the circuit, as shown in Fig. 4.5. The binary 
combinations on these lines will then only be a subset of 
all possible combinations during normal (fault free) 
operation. A failure in this or associated circuitry will 
(ideally) produce a value outside this allowed subset and 
will therefore be detected by the checking process.
This can be expressed more formally as follows C4.323.
For a particular code, the allowed binary combinations or
vectors are a subset S of a universe U chosen so that a
likely failure affecting vector X^ in S produces vector
X.' which is not in S. A codeword is a vector in S and a 
i
a non codeword a vector in the set U-S. If X^ is a
codeword and X^' a different vector produced by a failure
f, then f is a detectable error if X^' is a non codeword
and f an undetectable error if X.' is the same or another 
/ ^
-57-
codeword. These concepts are illustrated in Fig. 4.6.
The choice of code is dependent on the following:
1) The types of fault to be detected.
2) Whether logical or arithmetic processing is involved, 
since some codes are not preserved during these 
operations.
3) The added complexity to the uncoded circuit.
4) The cost of installation.
5) How easily the code is checked.
Codes essentially fall into two categories, separable and 
non separable. In a separable (systematic) code, the 
codeword is formed by the concatenation of additional 
(separate) check bits to the original data word. In a non 
separable (non systematic) code, the original data word is 
extended and modified to form the codeword.
4.3.1 : Single Parity Codes
Parity is one of the simplest coding techniques available 
and has been extensively used for many decades. In its 
basic form, a single parity bit is appended to an n-bit 
data word to produce odd or even parity. For even parity 
the parity bit is a '1* or a "O' such that the total
number of ones in the word (including the parity bit) is
an even number, whilst for odd parity the total number of 
ones is odd.
Expressed mathematically the parity bit P for an n-bit
word x-, . . . x^ is:1 n
Even parity: Peven = ©  x2 © --- ©  xn_1 ©  xR (4.1)
Odd parity: PQdd = xx ©  x2 © --- ©  xn_1 ©  xn (4.2)
This is illustrated in Fig. 4.7.
-58-
Considering odd parity, any fault which changes the number 
of ones from odd to even is detectable. Thus odd parity 
will detect all single bit errors as well as multiple bit 
errors which are odd in number. The same is true for even 
parity.
The code is checked with an Exclusive-OR (EXOR) tree, as 
shown in Fig. 4.8. An MSI package is available for this 
purpose C4.333.
Parity is used extensively for checking data transmission 
paths and memory, but is not, in general, preserved by 
arithmetic or logical operations, as indicated in Fig. 
4.9. If parity is to be used with arithmetic or logical 
operations, then the parity bit can be regenerated after 
the operation with the use of parity prediction techniques 
C4.34-4.363.
Overall parity creates the least redundancy (one bit) and 
is the cheapest code to generate and check.
4.3.2 : B-Adjacent Codes C4.37,4.383
Basic parity is ideally suited for memory where each 
device handles a single bit, for example, IK words x 1-bit 
RAM chips, since it will detect all failures within that 
device. However, if the memory device is four bits wide 
(for example, IK words x 4-bits) then an internal fault 
could cause a four bit error, which is undetectable by 
simple parity. In this case the b-adjacent code can be 
used. This is illustrated in Fig. 4.10, where the tech­
nique is also known as interleaved parity. An error in b 
adjacent bits is only a single error for each of the b 
parity check codes. The checker now consists of b x 
(k+1)-input EXOR trees and one b-bit AND gate.
4.3.3 : Duplication Codes
Where it is difficult to code the internal paths of a 
✓
-59-
circuit or device, then the easiest solution is to 
duplicate the whole circuit and compare the outputs of 
normal and duplicated modules to detect failures in each, 
as shown in Fig 4.11.
The advantage of this technique is that all failures 
(single and multiple) will be detected, except those which 
identically change the outputs of both modules. To 
overcome the possible occurrence of identical failures 
occurring in each module, Sedmak C4.14,4.3911 suggests that 
the duplicated module should be an inverse of the main 
circuit.
Duplication is a method of coding because when correspond­
ing outputs are compared, the input to the comparator for 
each pair will be <00> or <11> for modules giving 
identical outputs, the duplication code, and <01> or <10> 
for modules with complementary outputs, the l-out-of-2 
code. Thus errors are indicated by an input pair of <01> 
or <10> for the duplication code and <00> or <11> for the 
l-out-of-2 code. The l-out-of-2-code is preferred since 
it detects unidirectional multiple errors, such as those 
caused by a loss of power.
Duplication provides the highest fault coverage, but has 
the most redundancy and is the most expensive form of 
checking. Larsen C4.40D compares the redundancy of 
duplication with the redundancy of other coding forms.
4.3.4 : Checksum Codes C4.413
Checksum codes have been used extensively to detect 
errors in data transmission and data storage. Data is 
stored and transmitted in b-bit words. Appended to the 
end of each packet is a check word or checksum which is 
the binary sum of the words within that packet. The 
length of these packets may be anything from one to 
thousands of bytes.
-60-
The principle method of checking the code is software 
simply because of the vast amount of data involved. 
However, hardware detection of errors in individual words 
may be performed by breaking the word down into k x b-bit 
bytes and applying the technique to these bytes, as shown 
in Fig 4.12.
An extra b-bit byte is required for the checksum which 
will detect all faults within a single byte. The checker 
consists of (k-1) x b-bit binary adders and a b-bit 
comparator.
Checksums can also be incorporated into arithmetic opera­
tions with the use of checksum prediction C4.423.
Checksums provide the same fault coverage as the b- 
adjacent code in Fig. 4.10, but with a checker which is 
more expensive and slower.
4.3.5 : AN Codes C4.433
The AN code is an arithmetic code C4.44,4.453, that is to 
say, unlike parity, is preserved during arithmetic opera­
tions. The code is described as non systematic and non 
separable.
In the AN code an uncoded word X is multiplied by a check 
base A to form the codeword AX. Each codeword is formed
Thus the encoding of the sum of two numbers is the sum of 
their respective encodings. When no fault occurs, the 
adder output is a multiple of A. A checker monitors this 
output and produces an error signal when the residue or 
remainder is not zero. This structure is illustrated in 
Fig. 4.13, together with a numerical example. All faults
by appending bits to X. Binary addition of
codewords is performed so that:
AX + AY = A (X+Y) (4.3)
-61-
producing an output which is not a multiple of A will be 
detected. These faults include most unidirectional multi­
ple errors.
In this general case, the checker must carry out long 
division by A to obtain the residue which is a slow and 
complex hardware algorithm. However, this process can be 
greatly simplified if the check base A = 2^-1, where b is
an integer greater than one C4.463. This is known as the
low cost AN code because the checker now greatly simpli­
fies to an adder tree of b-bit modulo 2^-1 adders plus an 
AND gate. A sample structure together with a numerical 
example is given in Fig. 4.14. Note that:
i) The modulo 2^-1 adders are ordinary binary adders with
end around carry, 
ii) The correct residue is now all l's (all 0's would be 
produced by normal long division), so the final check 
of the residue might need to take into account all 0's 
or all l's as being correct.
All errors within a b-bit byte can be detected except 
those of magnitude 2^-1. This form of coding has similar 
redundancy and checker costs to the checksum code, but may 
be a little slower. The AN code is not suitable for
logical operations since it is non separable.
4.3.6 : Residue Codes C4.473
The basic residue code is, again, an arithmetic code and 
similar to the AN code except that it is separable. A
word X is encoded by appending check bits C(X) to it which 
are the residue of X divided by a check base A. The
number of check bits required is given by flog2Aj . The
encoding of the sum of two numbers is the binary sum of 
the original data words, say X and Y, and the independent 
summation of their respective check bits modulo A. The 
encoding of the sum of X and Y, f(X+Y), is then:
-62-
f(X+Y) = <C(X) +A C(Y), X+Y> (4.4)
where +, is addition modulo A and is concatenation.A
A checker which compares C(X+Y) with (C(X) +A C(Y)) will 
then be used to detect errors in the addition, as shown in 
Fig. 4.15. All faults will be detected except those which 
produce an all 0's or all l's residue.
As with the AN code, a low cost residue code can be 
produced by using A = 2^-1, where b is again an integer 
greater than one. This is now beneficial to both the 
generation and checking of the code. A sample structure 
and numerical example are given in Fig. 4.16. The same 
fault detection capabiltiy, redundancy and checking costs 
apply as for the AN code.
4.3.7 : M-out-of-n Codes C4.483
M-out-of-n codewords are a subset of all n-bit words where 
exactly m of its bits are a - logic 1. The code is 
therefore termed a fixed weight code. The number of 
possible combinations for m bits out of an n-bit word is 
given by:
Cn = n!/m!(n-m)i (4.5)m
Although m-out-of-n codes are non separable and non 
systematic, and therefore not suitable for processing 
systematic data, they are suitable for encoding control 
information, where the individual bits of a word have 
specific functions rather than a numerical representation. 
In some cases, for example, memory address decoding, they 
occur naturally. They can detect all unidirectional 
multiple errors, except faults which change one codeword 
to another codeword.
A maximum number of codewords occurs when m equals the 
integer part of n/2 : m = . Using this particular
-63-
case and an 8-bit data word, consider the following. If
8the word is encoded using simple parity, then 2 = 256
codewords are provided with one redundant bit. A similar 
number of codewords (252) are available from a 5-out-of-10 
code which requires two redundant bits. So, compared with 
parity, n/2-out-of-n codes, better known as k-out-of-2k 
codes, require slightly more redundancy for much better 
fault coverage.
4.3.8 : Berger Codes C4.49-4.523
Berger codes are optimal separable (systematic) codes, 
where an n-bit word X is encoded by the addition of k 
checkbits. The checkbits are the binary representation of 
the number of 0's in X, XO, or the binary representation 
of the number of l's in X complemented (bit by bit), XI. 
This results in k check bits, where k = r the
number of binary bits needed to represent n. XO and XI
It
will only be identical when n = 2 -1, when a maximal 
length Berger code is produced.
Berger codes will detect all unidirectional errors and are 
optimal, in terms of the number of check bits required for 
n information bits, amongst all separable codes that 
detect unidirectional errors. However, they are more 
redundant than m-out-of-n codes, which also detect uni­
directional errors, but Berger codes have the separability 
advantage which simplifies and minimises encoding and 
decoding hardware.
The checker consists of a series of full adder modules to 
add the information bits in parallel. Fig. 4.17 shows a 
structure for n=7 and k=3, together with a numerical 
example.
4.3.9 : Cyclic Codes C4.533
Binary cyclic codes have the property that any cyclic
shift of a code word is also a code word. The encoding
✓
-64-
and decoding processes for these codes are straight­
forward with the use of linear feedback shift registers 
(see section 4.2), so the codes are ideal for checking 
serial data.
An n-bit binary number may be represented as a polynomial 
M(x) thus:
—  1 y \ 9
M(x) = m ,x + m nx * + .... + m,x + mA (4.6)n-1 n-2 1 0
Two forms of cyclic code exist; systematic or non syste­
matic.
4.3.9.1 : Non Systematic Cyclic Codes
A code polynomial V(x) is formed from the product G(x)M(x)
V(x) = G(x)M(x) (4.7)
where G(x) is a suitable generator polynomial of the form
G(x) = grxr + Sr_1xr~1 + .... + % i x  + (4.8)
For an k-bit encoded word r = k-n (n is the number of
information bits).
This code is non systematic since the information bits are 
scrambled in the polynomial product. Fig. 4.18 gives an 
example.
4.3.9.2 : Systematic Cyclic Codes
V(x) is formed by appending the inverse of the remainder
ITwhen x M(x) is divided by G(x) as check bits to the 
original information bits.
xrM(x)/G(x> = Q(x) + C(x)/G(x) (4.9)




V(x) = xrM(x> - C(x) = Q(x)G(x) (4.11)
where: r = k-n as above.
Q(x) and C(x) are polynomials, with the highest 
power of x in C(x) being less than r.
Thus V(x) is a multiple of G(x). Fig. 4.19 gives an 
example.
These cyclic codes are referred to as (k,n) codes. They 
will detect all single errors and all burst errors that 
affect k-n (r) or fewer adjacent bits. The cyclic codes 
can also be used for error correction.
Both the encoding and decoding of systematic cyclic codes 
requires the use of division by G(x):
Encoding: xrM(x)/G(x) and use the inverse of the remain­
der as check bits.
Decoding: V(x)/G(x) and if the remainder is zero it is a 
codeword.
A linear feedback shift register (LFSR) can be used to 
serially divide by G(x). Fig. 4.20 shows such a structure 
where G(x) is that used in the above example.
4.3.10 : Hamming Codes C4.12,4.543
The Hamming codes extend the principles of single parity. 
They are not only able to detect errors, but also able to 
correct them as well. In these codes more than one parity 
bit is appended to a data word X, such that each added bit 
represents the parity of a different subset of the n 
information bits.
Codes are e-error detecting if any fault causing up to e 
erroneous bits can be detected and f-error correcting if
-66-
f erroneous bits can be corrected to produce the original 
fault free data.
The Hamming distance d of a code is the minimum number of 
bits between two codewords. Fig. 4.21 shows the inter­
relation of d,e and f derived from coding theory C4.123. 
For simple parity d=2, which allows single error detec­
tion and zero error correction. Since the Hamming codes 
provide error correction, they must have d greater than 2.
Consider the case for a single error correcting code 
C4.123. To be single error correcting a data word of n 
information bits requires c check bits, where
2C > n+c+1. (4.12)
For an overall b-bit word, the check bits ideally occupy
positions b q, b b .....  b c_2«
2 2 2 2
As an example, let n=4 such that (4.12) gives c > 3. Take 
c=3 and a seven bit word is required, .... b^, with the 
check bits occupying b^, b2and b^.
The values of the check bits are determined from the 
parity check equations:
b4
such that : P3 = b? ©  b6 ©  b5 ©  b4 = 0 (4.13)
**2 such that : P2 = b? ©  b& ©  b3 ©  b2 = 0 (4.14)
such that : PI = b? ©  b5 ©  b3 ©  bx = 0 (4.15)
These are even parity equations, so an even number of l's 
is required to satisfy the left hand side of each 
equation.
If the information bits have values ^3 = 1* ^5=0,
and b^ = lf then the check bits have values b^=0, an(*
b^=l, so that the encoded word is 1001100.
If a single bit error occurs in bit b^ the word becomes
-67-
1011100. When checking occurs, P3 = l, P2=0 and P1 = 0. It 
can be seen from the parity check equations that if the 
outputs P3, P2 and PI are concatenated in this order, then 
they represent the binary number of the erroneous bit, in 
this case 101, i.e. b^, which can be corrected.
Consider now a fault in bits b^ and b ^ r  i.e. a double 
fault. The modified word becomes 1011110, with PI, P2 and 
P3 all equal to 1. Accordingly, bit 7 will be corrected 
producing 0011110. This demonstrates that although the 
double bit error has been detected, it cannot be correc­
ted. In fact, an incorrect correction will be made. Fig. 
4.22 shows pictorially why this occurs.
A more typical information word of eight bits requires at 
least four check bits, from (4.12). In this case, it is 
equally redundant as a Berger code with n=8, but has twice 
as many redundant bits for n=8 in an equivalent k-out-of- 
2k code (5-out-of-10). However, whilst the k-out-of-2k 
code can detect all unidirectional errors it has no error 
correction capacity 2), whereas the Hamming code
considered has single error correction and double error 
detection.
The checking requirement for this Hamming code is c parity
c—1
trees, each with up to 2 inputs (c is the number of 
check bits).
Practical implementation of the Hamming code in micro­
processor based systems is presented by Heimlich C4.553 
and Wall C4.133.
4.4 : SCAN DESIGN
Scan design is a technique developed by IBM which involves
the reconfiguration of storage elements. It exists in two
main forms; Scan Path design using conventional storage
elements and Level Sensitive Scan design using special




4.4.1 : Scan Path Design 114.31,4.56-4.583
In scan path design all storage elements (flip-flops) have 
additional circuitry as indicated in Fig. 4.23. This 
additional circuitry is essentially a 2 to 1 multiplexer 
in front of the D input, with normal data and scan data 
as the two inputs.
During test, this circuitry enables the storage elements 
to be reconfigured serially into a shift register, hence­
forth termed the scan register, as shown in Fig. 4.24. 
The scan data out from one storage element is inherently 
connected to the scan data input of the next storage 
element and it is this data path which is selected for 
test purposes by the scan enable line. A sequential 
circuit is then reduced to a shift register and combina­
tional logic, so that test patterns are required for the 
combinational logic only (a comparatively easy task com­
pared with test generation for sequential logic). These 
are applied using the normal input pins and the access 
path provided by the scan register. If, for example, a 
LFSR data generator and LFSR data compressor are added 
with appropriate control circuitry, then the test setup 
becomes that in Fig. 4.25.
For Fig. 4.25 testing is as follows:
1) Reset the data generator and the data compressor and 
then enable the data generator.
2) Set mode to test. This reconfigures the storage 
elements into a scan register.
3) Clock the circuit until the scan register has been
completely loaded with data (via scan in). This is
the first test for the combinational logic.
4) Set mode to normal and set up primary input data.
5) Clock the circuit once.
6) Set mode to test, latching primary output data.
✓
-69-
7) Enable the data compressor (first time only) and shift 
out the contents of the scan register into it, whilst 
simultaneously loading the next test pattern.
8) Repeat steps 4)-7) until the test generator has sup­
plied all possible test patterns.
9) Inhibit data generator and data compressor.
10) Compare the signature held in the data compressor with 
a reference value and generate a pass/fail indication.
4.4.2 : Level Sensitive Scan Design C4.58-4.633
The level sensitive scan design (LSSD) technique is 
similar to the scan path, except that the storage element 
is a shift register latch (SRL) which uses level sensitive 
latches in a master-slave configuration, as shown in Fig. 
4.26. This circuit is nearly independent of hard to 
control ac characteristics such as rise time, fall time 
and overall delay for correct operation and is hazard/race 
free. However, there are restrictions on when and how 
signals can change C4.593.
The scan data out is now a separate line, but still 
connects to the scan data in of the next unit to form the 
scan register, which is now operated from a two phase non­
overlapping clock C4.59-4.613.
LSSD is used extensively in the IBM System/38 C4.60,4.613.
4.4.3 : Advantages and Limitations of Scan Design 
Advantages:
1) Sequential logic is transformed into a combinational 
form, allowing well understood test generation tech­
niques such as the D-Algorithm or Boolean Difference 
C4.64,4.653 to be used.
2) It provides a vast improvement on the controllability 
and observability of internal nodes compared to a
circuit without this technique.
✓
-70-
3) It can be used for chip, board and system testing 
during production and in the field.
4) The additional circuitry creates little performance 
degradation during normal operation.
Limitations:
1) Each storage element is two to three times logically 
more complex than a standard unit (although the 
additional latches required for LSSD can be used for 
other system functions).
2) Up to four more I/O pins are required.
3) External control and clock circuitry is required.
4) External inputs will have to be precisely controlled 
to generate the correct response (signature).
5) It provides only a dc test of the system.
6) The serial nature of the data can result in long test 
times.
Scan path can be added to conventional designs C4.573, 
whilst LSSD is included during design.
The above assumes that all the storage elements in a 
circuit will be serially connected to form a shift 
register, giving a complete scan path, but Trischler 
C4.663 considers the effect of not including every 
element, which results in an incomplete scan path.
4.4.4 : STUMPS
The concepts discussed above may be used for circuits 
within a single device or several devices cascaded. 
However, for multiple device applications, self test using 
MISR and a parallel shift register sequence generator 
(STUMPS) is an extension of these ideas, as shown in Fig. 
4.27 C4.313.
The scan inputs are fed from the parallel output of a LFSR
pseudorandom sequence generator, whilst the scan outputs 
✓
-71-
feed the parallel inputs of a MISR via a set of control 
gates. The control gates allow the test response from a 
device or devices to be ignored in order to isolate an 
incorrect signature to one particular chip. Obviously
additional known signatures will be required for this 
purpose.
Using this technique, a long shift register has been 
broken down into a number of smaller units operating in 
parallel, so the test time will be significantly reduced.
4.5 : THE BUILT IN LOGIC BLOCK OBSERVER f?'*'fan■ ...... ........... - 4.67-4.69D
THE Built In Logic Block Observer (BILBO) combines many of 
the elements used in scan design to provide a self testing 
structure. In a similar manner to the scan path technique, 
the BILBO is formed from reconfigured storage elements
within a network, plus a number of additional gates.
It consists of latches with feedback paths via EXOR gates, 
which are typical of LFSRs. Fig.* 4.28a shows a typical
BILBO structure which has parallel inputs from combina­
tional logic Zi, parallel outputs Qi,a serial input SDI, 
a serial ouput SDO and two control inputs B1 and B2.
B1 and B2 select four operating modes:
Mode 1 : B1=B2=1 - Fig. 4.28b
This is the normal mode of operation. The circuit
operates as a parallel latch with data loaded at the Z 
inputs on a transition of the clock.
Mode 2 : B1=B2=0 - Fig. 4.28c
The circuit becomes a shift register. Data present at 
SDI will be shifted through the four latches, appearing 
at Zi and SDO.
Note that the combination of Modes 1 and 2 provide a scan 
register.
-72-
Mode 3 : Bl=l, B2=0 - Fig. 4.28d
The circuit now becomes a MISR. It can be used to
compress test responses or, for constant Zi, generate 
pseudorandom data streams at Qi.
Mode 4 : B1=0, B2=l - Fig. 4.28e
On a clock transition all latches will be reset (0 
applied to all inputs).
The BILBO is thus an incredibly versatile building block 
for the construction of self testing circuits, as indi­
cated in Fig 4.29. Alternatively, it can be used to
assist other BIT techniques. The scan path structure of 
Fig 4.25, for example, can be modified to include a BILBO 
which will perform the data generation and data com­
pression functions simultaneously, such that the response 
of the circuit to one test influences the next test, as 
shown in Fig 4.30. Once a failure has been detected, the
other two modes of operation allow the circuit to perform
as a conventional scan register for further diagnosis.
Fasang and Konemann C4.67-4.693 detail experimental re­
sults and usage for the BILBO, allowing BIT at normal 
system speed and the high fault resolution quality of scan 
path techniques.
4.6 : AUTONOMOUS TESTING C4.62,4.70-4.743
Scan design provides a powerful testing mechanism, but
introduces additional complexity into the design process
and requires test pattern generation (TPG) C4.64,4.651
determined by the type of faults to be detected (fault
modelling). Scanning patterns in and out of the network
means a long testing time, especially for large circuits.
Even if pseudorandom patterns are used to eliminate TPG
(as prevously described), then the test time will be
substantially increased, since there must be sufficient
patterns to provide the desired fault coverage.
/
-73-
Autonomous testing does not require fault modelling or 
TPG. All possible input combinations are applied to 
combinational logic and all sequences to sequential logic, 
with outputs monitored for correct operation. This is 
known as exhaustive testing, which, in general, is a 
lengthy process. However, the testing time can be
substantially reduced by partitioning the circuit under 
test into subcircuits which have a relatively small 
number of inputs compared to those of the complete 
circuit. Williams L 4.62,4.633 indicates that testing 
time is related to the cube of the complexity (number of 
gates) in the circuit.
Requirements for this technique are:
1) Mechanisms to partition the circuit under test into 
subcircuits.
2) Additional circuitry, or reconfiguration of existing 
circuitry, to provide all input combinations to the 
subcircuit.
3) Additional circuitry, or reconfiguration of existing 
circuitry, to verify the response of the subcircuit.
As with many of the BIT techniques, the circuit functions 
in either a normal or a test mode of operation. The test 
mode of operation is as described above. All additional 
test circuitry is transparent to the inputs and outputs 
during the normal mode of operation. The transformation 
from the normal to the test mode of operation is presented 
in Fig. 4.31.
Parts 2) and 3) of the strategy above are implemented with 
the now familiar LFSR, modified so that the generator can 
provide all output combinations, since the LFSR in Fig.
4.2 does not generate the all zero state. A circuit 
proposed by McCluskey (MC) in C4.733, and given in Fig. 
4.32, closely resembles the BILBO already described.
-74-
In order to exhaustively test each subcircuit, the par­
titioning must allow all its inputs to be controllable and 
all its outputs observable.
Two approaches are presented for partitioning by MC 
C4.733. These are the use of multiplexers and sensitised 
partitioning.
4.6.1 : Partitioning with multiplexers
The use of multiplexers is demonstrated in Fig. 4.33. It 
shows a module G broken down into two subcircuits G1 and 
G2 (Fig. 4.33a); how multiplexers are added to create the 
subcircuits (Fig. 4.33b); how the multiplexers are con­
figured for normal operation (Fig. 4.33c) and how the 
multiplexers are configured to test subcircuit G1 (Fig. 
4.33d). Bozorgui-Nesbat (BN) C4.723 indicates that a 
stuck-at fault on any line, including the testing cir­
cuitry, will be detected, whilst MC C4.733 warns against 
failures affecting normal operation only. BN C4.723 also 
details a generalised approach bo partitioning in this 
manner. MC C4.733 describes a practical implementation of 
this technique for a 74181 ALU/Function Generator C4.693 
and compares the number of tests required with and without 
the partitioning. Buehler C4.763 carries out a similar 
operation for a full/half adder combination.
However, this method of partitioning involves an addition­
al overhead to provide the multiplexers and their control 
circuitry. This overhead is dependent on the size and 
complexity of the network and could be significant. The 
multiplexers also reduce the speed of normal operation.
4.6.2 : Sensitised Partitioning
MC C4.733 suggests that the effect of multiplexer inser­
tion should be created using sensitised partitioning. 
Here circuit partitioning and subcircuit isolation are 
achieved by applying an appropriate pattern to some of the
-75-
input lines. As an example, consider the circuit G in 
Fig. 4.34a divided into the two subcircuits G1 and G2 of 
Fig 4.34b. These subcircuits can be tested as follows:
1) To test G1 : Put D1=D2=0, then E1=C1 and E2=C2 (outputs
of G2=outputs of Gl>. Apply all binary combinations 
to CA1,A2,B1,B23.
2) To test G2 : Put B1=B2=1, then C1=A1 and C2=B2 (inputs
to G2=inputs to Gl). Apply all binary combinations 
to £A1,A2,C1,C21.
The test configurations for Gl and G2 are shown in Figs. 
4.34c and 4.34d respectively. Each subcircuit effectively
4
has four inputs and therefore requires 2 =16 test pat­
terns , so a total of 32 tests are required to test both
0
subcircuits,, compared with 2 =64 if the circuit is tested 
as a single unit. MC C4.733 again applies the technique 
to the 74181 ALU/Function Generator. In addition, MC 
suggests how to implement the pattern generator and res­
ponse verifier within the same device.
Unlike the insertion of multiplexers, sensitised parti­
tioning is heavily dependent on circuit structure, but 
requires less additional hardware overhead.
MC £4.703 extends the above principles for autonomous test 
to sequential circuits.
4.7 : SERIAL SHADOW REGISTER £4.583
Serial shadow register (SSR) diagnostics is an on-chip 
technique which allows complex systems to be easily 
controlled and observed.
The structure, shown in Fig. 4.35, uses 'output' and 
'shadow' registers to initialise and test combinational 
and sequential circuits. External control circuitry 
allows the data transfer between the shadow register and
-76-
the circuit under test to be bidirectional. Fig. 4.36 
details the various operations of the SSR structure.
The output and shadow registers are both connected to the 
D and B buses. The shadow register has either serial or 
parallel input data. Serial data arises from test pat­
terns shifted in, or test responses shifted out, i.e. scan 
data as in the scan path technique, whilst parallel data 
is loaded from either the B bus or the output register. A 
2 to 1 multiplexer allows the output register to be loaded 
from either the D bus or the shadow register. It can thus 
obtain serially input test patterns from the shadow 
register, or have its contents (test responses) copied to 
the shadow register to be serially output. Normal 
operation can continue immediately after system states 
have been read from the output register, unlike the Scan 
Path and BILBO techniques, in which these states have to 
be reloaded as they are destroyed by the read process. In 
addition, switching to the test mode does not alter system 
states in the output register, this only happens when the 
test pattern is loaded.
DCLK and CLK, generated from the system clock, clock the
shadow and output registers respectively, whilst the mode 
and SDI inputs control data loaded into the registers from
external sources. Together these four lines control the
on-chip diagnostics, as detailed in Fig. 4.36. The two
clocks increase test speed since test patterns can be 
loaded (or test reponses unloaded) during normal output 
register operation.
Fig 4.37 shows a typical application of the SSR technique, 
where SSRs replace all ordinary and I/O registers.
Overall, the technique allows a great deal of flexibility 
in the test process and although it requires more hard­
ware than other techniques (shadow registers, mutilplexers 
and control circuitry),its superiority is highlighted by a 
comparison wibh the Scan Path and BILBO structures C4.58D.
-77-
4.8 : OTHER BIT TECHNIQUES
The five techniques considered so far are the most widely 
used forms of BIT. However, there are a number of other 
techniques which will be briefly mentioned for complete­
ness.
4.8.1 : Scan Set Logic
A technique proposed by Sperry Univac which combines the 
principle of scan path and SSR to provide a scan register 
independent of system latches and thus not part of any 
system data path C4.623.
4.8.2 : Random Access Scan
Random access scan is similar to the scan path technique 
except that shift registers are not employed. Instead, an 
addressing scheme allows each latch to be selected for 
control and observation purposes* C4.623.
4.8.3 : Syndrome Testing
All 2n patterns are applied to an n-input combinational 
circuit and its syndrome compiled C4.62,4.773. The syn­
drome is essentially the number of l's appearing at the 
output of the circuit during the test, which is then 
compared with its correct value to locate a faulty 
network. However, some circuits need to be modified so 
that they are syndrome testable for stuck-at faults. 
Savir C4.773 demonstrates this technique for the 74181 
ALU.
4.8.4 : Walsh Coefficient Testing
Again, this technique requires the application of all 
possible input patterns to a combinational network, in 
which two of its Walsh coefficients are checked to de­
termine the presence of stuck-at faults 114.62,4.78-4.803. 
Modification of the network might also be neccessary so 
that it is testable by this method.
4.8.5 : ROM Based Test Patterns and Responses
Instead of using test patterns from an external source or 
a LFSR, patterns are stored in a special test ROM C4.813. 
Test responses are compared with expected responses stored 
in another ROM.
4.8.6 : Self Oscillation
During its test mode of operation, the circuit is recon­
figured so that it will oscillate only if it is fault free
C4.763. Specific input combinations are required to 
create this condition.
4.8.7 : Self Comparison
In a self comparison scheme, the* circuit under test is 
partitioned into two subcircuits which produce identical 
outputs for a given set of inputs C4.76,4.813. Checking
then simply involves a comparison of the corresponding
outputs in each subcircuit.
4.8.8 : History Memory
Not strictly a BIT technique perhaps, but worth a mention 
as it is ideally suited for computer applications C4.823. 
A large quantity of memory records particular or all bus 
transactions for subsequent analysis, so that hardware and 
software failures can be located. In principle, it is 
similar to a logic state analyser.
4.8.9 : Reed Muller Canonical Representation
An n-input combinational circuit can be implemented in 
Reed Muller canonical form thus C4.833:
-79-
f (X1 xn J C0 ®  Clxl ®  C2X2 ® ..... ®  Cnxn ®
(4.13)
This form requires, at most, (3n+4) tests to detect all 
single stuck-at faults, including input faults C4.833.
4.8.10 : Technique Comparisons
Overall, certain papers make useful comparisons between 
techniques in terms of additional hardware requirements, 
fault coverage, test time, pattern generation and test 
response mechanisms, testability of the test circuitry, 
VLSI applicability and so on.
Buehler C4,763 compares self oscillation, self comparison, 
scan path and BILBO; Bhavsar C4.813 compares duplication, 
residue coding, ROM based test patterns and responses, 
autonomous testing and self comparison; whilst Gupta 
C4.583 compares the SSR, BILBO and scan path techniques.
4.9 : INTEGRATED CIRCUITS WITH BIT
A number of integrated circuits which either contain BIT, 
or assist BIT, are already commercially available, with 
hopefully many more to come.
Some of these devices are reviewed below, with reference 
to their function and their BIT.
4.9.1 : Motorola MC6805 : single chip 8-bit microprocessor
The 6805 has 116 bytes of on-board ROM and control 
circuitry which allow a self check mode of operation 
C4.84-4.863. When certain external requirements have been 
met, the self check mode of operation can be initiated. 
The internal program then runs repeatedly until it is 
manually terminated or it discovers a fault. It firstly
-80-
tests I/O lines on which a test status is emitted, 
proceeding to test the other I/O lines, RAM, ROM and 
interrupts.
4.9.2 : Motorola M68000 : 16-bit microprocessor
The 68000 does not strictly have BIT, but it does have 
exception processing - processing for exceptional condi­
tions C4.87,4.883. These exceptions can be generated 
internally or externally and include divide by zero, 
odd address accesses, illegal instructions, bus errors, 
privilege violations and spurious interrupts. The ex­
ternally generated bus error can also cause the processor 
to re-run the bus cycle in which the error was detected. 
If a double bus error occurs during exception processing, 
then the processor is halted. Also, a hardware trace 
faciltity is provided which causes exception processing 
after each instruction is executed, allowing a debugging program 
to monitor main program execution.
4.9.3 : Intel iAPX 432 : 32-bit microprocessor
The iAPX 432 consists of a two-chip general data processor 
(GDP), interface processor, bus interface unit (BIU) and 
memory control unit C4.89,4.903. It is aimed at multi­
processor applications and features a functional redundan­
cy checking mode for hardware error detection. When the 
GDP devices are in a checking mode, they compare certain 
of their I/O lines with an internally generated result, 
signalling an error to the BIU. The BIU then serially 
transmits the type and location of the error to all other 
nodes in the system. Redundancy in this error reporting 
mechanism ensures that each node will receive the same 
correct error report. Each node logs the information 
and proceeds with an appropriate recovery procedure.
4.9.4 : Monolithic Memories 74S818 : Diagnostic Register
The 74S818 diagnostic register is a direct implementation 
/
-81-
of the SSR technique and contains the circuitry detailed 
in Fig. 4.35 C4.913.
4.9.5 : Monolithic Memories 63DA1643/841/441 : PROMs
This diagnostic PROM family consists of devices with 
arrays of varying depths, but all 4-bits wide C4.91,4.923. 
Each has a built in 4-bit SSR. Applications include a 
microprocessor control store with built in system diagnos­
tic test, as shown in Fig. 4.37, serial character 
generation or serial code conversion.
4.9.6 : National Semiconductor SLX6360 : Logic Array
The SI.X6360 is a 6000 gate semicustom logic array which 
has a BILBO C4.933. A 189-bit pseudorandom number genera­
tor, which is externally seeded, can deliver 20M vectors 
per second. The test responses are fed to a checksum 
register which generates an 88-bit signature.
4.9.7 : National Semiconductor DP8400 : Error Corrector
The DP8400 provides error correction and detection within 
one chip C4.943. Each chip handles 16 data bits, 
generates check bits using a modified Hamming code,
indicates and corrects single and double bit errors and
indicates triple bit errors. It can also differentiate
between transient and permanent failures. In addition to 
all of this, it can generate and check single bit byte 
parity and allows access to all of its internal gates for 
diagnostic purposes.
4.9.8 : LSI Logic LSA2000 : Structured Array Family
In addition to logic gates, the LSA2000 family of
structured arrays have various combinations of RAM, ROM
and/or special purpose megacells C4.953. They also
contain multiple scan paths which allow every memory bit
and every megacell transistor to be tested.
✓
-82-
4.9.9 : Hitachi : Gate Arrays
Hitachi gate arrays also include multiple path scan 
design, referred to as a scan bus structure C4.963. 
Master-slave latches are employed, similar to those in 
LSSD, although they are not level sensitive.
4.10 : SELF VERIFICATION
This section outlines the proposals presented by Sedmak 
C4.14,4.39,4.973 for self verification (SV). These pro­
posals provide an excellent focus for all the various on 
or off line testing forms of BIT considered in previous 
sections, hence their inclusion.
Sedmak suggests a somewhat ideal, but generalised approach 
to BIT, which merges on and off line testing so that 
verification of a fault free chip, board or system can be 
made without the use of external test equipment and with 
logic operating at its normal speed.
The overall scheme is shown in Fig. 4.38 and consists of 
the following:
1) A combinational/sequential partition or subcircuit of 
the complete circuit.
2) Internal Stimuli Generators (ISGs) which provide 
stimuli to thoroughly exercise the logic and expose 
any fault. They do not use stored test patterns or 
require test pattern generation.
3) An ISG Supervisor (ISGS) which coordinates the opera­
tion of all ISGs.
4) Fault Detection Circuits (FDCs) which monitor in­
termediate as well as final outputs, so that inter­
mittent or solid faults are immediately detected.
5) An Error Status Generator (ESG) which collects in­
formation from the FDCs and encodes it to provide 
error signals which indicate the presence and location
-83-
of failures.
Both the ISGs and ISGS operate during verification and not 
during normal operation. The FDCs and ESG operate con­
tinuously during normal operation and self verification.
In design for self verification the following need to be 
considered:
1) The size and complexity of each partition.
2) The quantity and position of FDCs and ISGs, versus the
complexity and position of the ISGS and ESG.
3) Detailed designs for FDCs, ISGs, ISGS and ESG.
In addition, it is desirable to:
a) Minimise the SV cost overhead.
b) Minimise the degradation of performance during normal
operation.
c) Maximise the speed of the SV process.
d) Maximise fault coverage during both modes of opera­
tion.
Using the constraints of a) to d), Sedmak C4.973 expands 
points 1) to 3), giving extensive guidelines for each 
(although little attention is given to the ESG). He also 
applies the design for verification process to a 32-bit 
ALU.
4.11 : CONCLUSIONS
Overall, it is considered by the author that the most 
important aspect of BIT and design for verification is the 
use of on line fault detection. If this forms the sole 
basis for testing and verification of microprocessor based 
systems then the following are achieved:
1) The system functions only in its normal mode of




2) Solid and intermittent faults are detected immediate­
ly.
3) No external test equipment is required.
4) No TPG or stimulus is required, since the circuit is
continuously supplying its own.
Faults that affect normal system operation need to be
detected, located and cause actions such as:
a) Notification to the outside world.
b) Error correction.
c) System halt or shutdown.
d) Operation retry C4.14,4.29,4.303.
There also has to be mechan-isms for fault detection. 
These are often coding techniques, which include duplica­
tion. The fault detection circuits are then code checkers 
as previously described (section 4.3). A problem then 
arises as to what happens if a failure occurs in the 
checking circuit. Is there a checker checking the 
checker, another checker checking that checker and so on 
ad-infinitum? Fortunately, the answer is no and the prob­
lem is resolved with the use of self checking checkers. 
Despite the presence of an internal fault, these circuits 
either produce a correct output or indicate that the 




4.1) A monolithic self-checking error detection process­
or - J. Chavade, M. Vergniault, P. Rousseau, Y. 
Crouzet, C. Landrault;1980 Test Conf.*;pp. 279-286.
4.2) Design specifications of a self-checking detection 
processor - Y. Crouzet, C. Landrault; FTCS-10*; pp. 
275-277.
4.3) Real-time fault diagnosics multiple microprocessor 
systems - K. Fu, G. M. Flachs; 1981 IEEE Real Time 
Systems Symposium; December 1981; pp. 56-61.
4.4) Microprogrammed control and reliable design of 
small computers - G. D. Kraft, W. N. Toy; Prentice- 
Hall Inc., Englewood Cliffs, New Jersey, USA; 1981; 
pp. 362-365.
4.5) Design of a microprogram control for a processor 
in an electronic switching system - T. F. Storey; 
Bell Syst. Tech. J.; February, 1976; pp. 183-232.
4.6) Error detecting codes, self checking circuits and 
applications - J. F. Wakerly; Elsevier - North 
Holland; New York; 1978; pp. 3-5.
4.7) Diagnosis and reliable design of digital systems 
- M. A. Breuer, A. D. Friedman; Pitman Publishing 
Ltd., London, 1977; pp. 274-276.
4.8) Fault Detection in digital circuits - A. D. Fried­
man, P. R. Menon; Prentice-Hall Inc., Englewood 
Cliffs, New Jersey, USA; 1971; pp. 193-202.
4.9) Error correcting codes for reliable digital cir­
cuits - I. S. Reed; Infotech*; pp. 713-745.
4.10) Error-correcting codes with byte error-detection 
capability - C. Chen; IEEE Trans. Comput.; Vol. C- 
32, No. 7; July, 1983; pp. 615-621.
4.11) A new class of error-correcting/detecting codes for 
fault-tolerant computer applications - D. K. Prad- 
han; IEEE Trans. Comput.; Vol. C-29, No. 6; June, 
1980; pp. 471-481.
4.12) Error detecting and error correcting codes - R. W. 
Hamming; Bell Syst. Tech. J.; Vol. 29, No. 1; 
January, 1950; pp. 147-160.
4.13) Applying the Hamming code to microprocessor based 
systems - E. L. Wall; Electronics; November 22, 
1979; pp. 103-110.
4.14) Fault-tolerance of a general purpose computer 
implemented by very large scale integration - R. M. 
Sedmak, H.L. Libergot; FTCS-8*; pp.137-143.
- 8 6 -
4.15) Designing reliable computer systems the fault
tolerant approach - 1 - R. G. Bennetts; IEE Elec­
tronics and Power; November/December, 1978; pp.
846-851.
4.16) Designing reliable computer systems the fault
tolerant approach - 2 - R. G. Bennetts; IEE Elec­
tronics and Power; January 1979; pp. 51-56.
4.17) Fault tolerance and digital systems - R. G. 
Bennetts; Microprocessors and Microsystems; Vol. 3, 
No. 8; October, 1979; pp. 365-373.
4.18) Fault tolerant computing: Techniques and develop­
ment - A. Avizienis; Infotech*; pp. 309-333.
4.19) Architectural design for near - 100% fault coverage 
- J. J. Stiffler; FTCS-6*; pp. 134-137.
4.20) A study of standard building blocks for the design 
of fault-tolerant distributed systems - D. A.
Rennels, A. Avizienis, M. Ercegovac; FTCS-8*; pp. 
144-149.
4.21) The STAR ( self-testing and repairing) computer: An 
investigation of the theory and practice of fault- 
tolerant computer design - A. Avizienis, G. C.
Gilley, F. P. Mathur, D. A. Rennels, J. A. Rohr, D. 
K. Rubin; IEEE Trans. Comput.; Vol. C-20, No. 11; 
November, 1971; pp. 1312-1321.
4.22) Design techniques for reliable hardware - F. P.
Maison; Infotech*; pp. 236-251.
4.23) Fault diagnosis in computer control systems - G.
Edge; Systems Technology; No. 31; April, 1979; pp. 
33-41.
4.24) Availability, reliability and maintainability 
aspects of the Sperry Univac 1100/60 - L. A. Boone, 
H. L. Liebergot, R. M. Sedmak; FTCS-10*; pp. 3-8.
4.25) Installability, availablity, reliability and main­
tainability aspects of the Sperry System 11 - A. K. 
Bhatt, D. R. Mueller; FTCS-14*; pp. 29-35.
4.26) Dependable computing and fault tolerance: Concepts
and terminology - J. Laprie; FTCS-15*; pp. 2-11.
4.27) Computer systems reliability: An overview - A.
Avizienis; Infotech*; pp. 216-233.
4.28) Error detection process - model, design, and its 
impact on computer performance - K. G. Shin, Y. 
Lee; IEEE Trans. Comput.; Vol. C-33, No. 6; June, 
1984; pp. 529-540.
-87-
4.29) On-line bus fault diagnosis in microprocessor
systems - D. P. Agrawal, V. K. Agarwal; J. of Dig.
Syst.; Vol. 4, No. 4; Winter, 1980; Computer
Science Press Inc., USA; pp. 377-391.
4.30) Error correction by alternate-data retry - J. F. 
Shedlestky; IEEE Trans. Comput.; Vol. 27, No. 2; 
1978; pp. 106-112.
4.31) Built-in test: A review - C. Maunder; IEE Electron­
ics and Power; March, 1985; pp. 204-208.
4.32) as per 4.6) p. 9.
4.33) The TTL book for design engineers - The engineer­
ing staff of Texas Instruments components group;
Texas Instruments; Fourth European edition; 1980; 
pp. 7/269-7/270 & 7/406-7/409.
4.34) Parity prediction in combinational circuits - B. 
Khodadad-Mostashiry; FTCS-9*; pp. 185-188.
4.35) as per 4.6); pp. 129-133 & 144-146.
4.36) as per 4.4); pp. 294-299.
4.37) as per 4.6); pp. 22-24.
4.38) b-adjacent error correction - D. C. Bossen; IBM J.
Res. Devel.; Vol. 14 No. 4; July 1970; pp. 402-408.
4.39) Design for self-verification : An approach for 
dealing with testability problems in VLSI-based 
designs - R. M. Sedmak; 1979 Test Conf.*; pp. 112- 
120.
4.40) Redundancy by coding versus redundancy by replica­
tion for failure-tolerant sequential circuits - R. 
W. Larsen, I. S. Reed; IEEE Trans. Comput.; Vol. 
C-21, No.2; February 1972; pp. 130-137.
4.41) as per 4.6); pp. 46-48, 114-5, 118-9 & 133-140.
4.42) as per 4.6); pp. 133-140.
4.43) as per 4.6); pp. 37-46, 71-74, 114-115 & 123-124.
4.44) Arithmetic codes: Cost and effectiveness studies 
for application in digital system design - A. 
Avizienis; IEEE Trans. Comput.; Vol. C-20, No. 11; 
November, 1971; pp. 1322-1331.
4.45) Error codes for arithmetic operations - H. L. 
Garner; IEEE Trans. Electronic Comput.; Vol. EC-15, 
No. 5, May, 1966; pp. 763-770.
4.46) as per 4.6); pp. 40-46.


















as per 4.6); pp. 38-46, 71-74, 114-115 & 147-148.
as per 4.6); pp. 48-50.
Modified Berger codes for the detection of uni­
directional errors - H. Dong; FTCS-12*;pp. 317-320.
Design of fast self-testing checkers for a class of 
Berger codes - S. J. Piestrak;FTCS-15*;pp. 418-423.
On totally-self-checking checkers for separable 
codes - M. J. Ashjaee, S. M. Reddy; FTCS-7*; pp. 
151-156.
Design of self-checking checkers for Berger codes
- M. A. Marouf, A. D. Friedman;FTCS-8*;pp. 179-184.
as per 4.6); pp 24-34.
as per 4.7); pp. 256-265.
Memory finds and fixes errors to raise reliability 
of microcomputer - A. Heimlich, J. Korelitz; 
Electronics; January 3, 1980; pp. 168-172.
Application of shift register approach and its 
effective implementation - M. Kawai, S. Funastu, A. 
Yamada; 1980 Test Conf.*; pp. 22-25.
Enhancing testability of large-scale integrated 
circuits via test points -and additional logic - M. 
J. Y. Williams, J. B. Angell; IEEE Trans. Comput.; 
Vol. C-22, No. 1; January, 1973; pp. 46-60.
Structured-test devices simplify test generation
- S. Gupta; EDN; 14 November, 1985; pp.289-298.
A logic design structure for LSI testability - E. 
B. Eichelberger, T. W. Williams; 14th Design Auto­
mation Conference; New Orleans, Louisiana, USA; 
June 1977; IEEE Pub. No. 77 CH1216-1C; pp. 462-468.
Level-sensitive scan design test chips, boards, 
system - N. C. Berglund; Electronics; 15 March, 
1979; pp. 108-110.
Design for testability of the IBM System/38 - L. 
A. Stolte, N. C. Berglund; 1979 Test Conf.*; pp. 
29-36.
Design for testability - T. W. Williams, K. P. 
Parker; Proc. of the IEEE; Vol. 71, No. 1; January, 
1983; pp. 98-112.
Testing logic networks and designing for testabil­
ity - T. W. Williams, K. P. Parker; Computer; 

















Test generation techniques - S. B. Akers; Computer; 
March, 1980; pp. 9-15.
LSI logic testing - an overview - E. I. Muehldorf, 
A. D. Savkar; IEEE Trans. Comput.; Vol. C-30, No. 
1; Janaury, 1981; pp. 1-16.
Incomplete scan path with an automatic test gen­
eration methodology - E. Trischler; 1980 Test 
Conf.*; pp. 153-162.
BIDCO, Built-in digital circuit observer - P. P.
Fasang; 1980 Test Conf.*; pp. 261-265.
Built-in logic block observation techniques - B.
Konemann, J. Mucha, G. Zwiehoff; 1979 Test Conf.*; 
pp. 37-317.
Built-in test for complex digital integrated cir­
cuits - B. Konemann, J. Mucha, G. Zwiehoff; IEEE 
J. of Solid-State circuits; Vol. SC-15, No. 3;
June 1980; pp.315-318.
Design for autonomous test - E. J. McCluskey, S. 
Bozorgui-Nesbat; 1980 Test Conf.*; pp. 15-21.
Verification testing - A pseudoexhaustive test 
technique - E. J. McCluskey; IEEE Trans. Comput.; 
Vol. C-33, No. 6; June, 1984; pp. 541-545.
Structured design for testability to eliminate test 
pattern generation - S. Bozorgui-Nesbat, E. J. 
McCluskey; FTCS-10;* pp. 158-163.
Design for autonomous test - E. J. McCluskey, S. 
Bozorgui-Nesbat; IEEE Trans. Comput.; Vol. C-30, 
No. 11; November, 1981; pp. 866-874.
Autonomous testing and its application to testable 
design of logic circuits - H. Eiki, K. Inagaki, S. 
Yajima; FTCS-10*; pp. 173-178.
The TTL book for design engineers - The engineer­
ing staff of Texas Instruments components group; 
Texas Instruments; Fourth European edition; 1980; 
pp. 7:271-7:281.
Off-line, built-in test techniques for VLSI cir­
cuits - M. G. Buehler, M. W. Sievers; Computer; 
June, 1982; pp. 69-82.
Syndrome testable design of combinational circuits 
- J. Savir; IEEE Trans. Comput.; Vol. C-29, No. 6; 
June, 1980; pp. 442-451.
Radamacher Walsh spectral techniques: A new tool
for digital network fault diagnosis - R. G. Ben­
netts, S. L. Hurst; FTCS-7; p. 213.
/
-90-
4.79) The logical processing of digital signals - S. L. 
Hurst; Crane, Russak, New York; 1978.
4.80) Testing by verifying Walsh coefficients - A. K. 
Susskind; FTCS-11*; pp.206-208.
4.81) Self-testing supercells: Alternative test strate­
gies - D. K. Bhavsar;Autotestcon '80*;pp. 135-139.
4.82) Enhance computer fault isolation with a history 
memory - G. L. Fitzgerald; Autotestcon '80*; pp. 
131-133.
4.83) as per 4.7); pp. 292-294.
4.84) Motorola MC6805P2 8-bit microcomputer unit; data
sheet; Motorola Semiconductor Products Inc.; 
Scotland, UK; 1982
4.85) Let your next microcomputer check itself and cut
down your testing overhead - J. Boney; Electronic 
Design; No. 18; 1 September 1979; pp. 100-105.
4.86) Motorola 6805 self test programs; Motorola Semi­
conductor Products Inc.; Scotland, UK.
4.87) Handling exceptions gracefully enhances software
reliability - T. W. Starnes; Electronics; 11 
September, 1980; pp. 153-157.
4.88) Motorola M68000; data sheet ADI-814-R1; Motorola
Semiconductor Products Inc., USA; 1980.
4.89) Intel iAPX 43201/43202 VLSI general data processor; 
data sheet; Intel Corporation ;1981.
4.90) Error reporting in the Intel iAPX 432 - D. B.
Johnson; FTCS-14*; p. 24-28.
4.91) Diagnostic PROMs enhance PCB testability - J.
Gabris; New Electronics;11 December 1984;pp. 37-43.
4.92) Monolithic Memories 53DA841/63DA841 2048 x 4 diag­
nostic registered PROM; data sheet; Monolithic 
Memories.
4.93) Semicustom-logic suppliers differ on how to best
deal with testability - W. Twaddell; EDN; 24
September, 1982; pp. 69-73.
4.94) 'Expandable' IC finds and fixes errors fast - C.
Carinalli, M. Evans; Electronic Design; 18 February 
1982; pp. 177-186.
4.95) Structured arrays mix gate arrays and standard
cells - R. Rasmussen; New Electronics; 18 February
1986; pp. 40-42.
-91-
4.96) Gate array test circuits generated automatically
- J. Vickerton; New Electronics; 18 February, 1986; 
pp. 58-60.
4.97) Implementation techniques for self-verification
- R. M. Sedmak; 1980 Test Conf.*; pp. 267-278.
4.98) Fail-safe computers increase reliability, lower 
costs - G. Kravetz; Mini-Micro Systems; April, 
1984; pp. 121-130.










Fault Isolation it Indication






[4 .9  -  4.13]
MAINTENANCE PROCESSOR 
Large computors 
[4 .3  -  4.5]
ON-UNE TESTING 
:ccts. tested as used 
: self verification
AD HOC CIRCUITS 
eg. Power supply monitor 
Watchdog timer 
Information displays
RETRY/ROLLBACK it RECOVERY 
Prevention of erroneous data 
reaching outputs 
[4.14,4.21,4.24,4.28 -  4.30]
REDUNDANCY >2 
eg. Triple modular
[4 .3  -  4.5]
CODING 
eg. Parity for buses 
Residue for arithmetic
OFF-UNE TESTING 
ccts.tested outside normal operation
SCAN DESIGN 
Scan Path — normal flip-flops 
LSSD — level sensitive latches
BUILT IN LOGIC OBSERVER 
Parallel latch,shift register 
or signature register
(b)















TEST PATTERN GENERATION 











N= 4 so number of patterns = 15
SEQUENCE 01 02 03 04
1 1 1 1 1
2 0 1 1 1
3 0 0 1 1
4 0 0 0 1
5 1 0 0 0
6 0 1 0 0
7 0 0 1 0
8 1 0 0 1
9 1 1 0 0
10 0 1 1 0
11 1 0 1 1
12 0 1 0 1
13 1 0 1 0
14 1 1 0 1
15 1 1 1 0
1 1 1 1 1










*•—  Final signature






































- CODE - 
CHECKERS
FIGURE 4.5 CODE GENERATION & CHECKING
XI, X2 and X3 are codewords : X4 is a non codeword 
fl is an undetectable error : f2 is a detectable error
FIGURE 4.6 FAILURES IN AN ERROR DETECTING CODE
FIGURE 4.7 GENERATION OF ODD AND EVEN PARITY
correct parity indicated by:
0 for even parity
1 for odd parity
parity bit
FIGURE 4.8 CHECKING SCHEME FOR PARITY - EOR TREE
-96-
51 0 1 0  1
2




Parity now even 
FIGURE 4.9 PARITY DURING AN ARITHMETIC PROCESS
MEMORY
■b parity bitsPI
XI ©  X5Pl=
P2= X2 ©  X6 
P3= X3 ©  X7 


































































FIGURE 4.12 CHECKING SCHEME FOR CHECKSUM CODES-HARDWARE
-98-
(10001010) 
AX — (11000011) 






X = 461q = 1011102 
Y = 191q = 0100112
A = 310 = 112 
n = 6
(no. of additional bits=2)
(0)


















































n = 7 








M2 Ml MO M( x) V(x)=M(x)G(x) V6 V 5 V4 V3 V2 VI VO
0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1
4 3 2 ,  
X + x >x +1 0 0 1 1 1 0 1
0 1 0 X
5 4 3 
X  +X -t-X +x 0 1 1 1 0 1 0
0 1 1 x+1 5 2 X +X + X+1 0 1 0 0 1 1 1
1 0 0
2
X 6 5 4 2 X +x +x +x 1 1 1 0 1 0 0
1 0 1 X2+1
6 5 3 , 
X  + x +x +1 1 1 0 1 0 0 1
1 1 0
2X +x
6 3 2 
X +x +x +x 1 0 0 1 1 1 0
1 1 1
2 , 
X  + X+1
6 4 
X  +x +X+1 1 0 1 0 0 1 1
nt \ 4 3 2 ,G(x) = x +x +X +1 n * 3 k = 7 r = 4
FIGURE 4.18 GENERATION OF A (7,3) NOH-SYSTEMATIC CYCLIC CODE
INFORMATION BITS
CODEWORD 
V(x)=x M(x )-C(x )
M2 Ml MO x 4 M ( x ) C(x)=REM(x4M(x)/G(x)) V6 V5 V4 V3 V2 VI VO
0 0 0 0 0 0 0 0 0 0 0 0
0 0 1
4





X  +X+1 0 1 0 0 1 1 1
0 1 1
5 4 





X +x -*-x 1 0 0 1 1 1 0
1 0 1
6 4 
X  +x X-t-1 1 0 1 0 0 1 1
1 1 0
6 5 
X  +x x3*l 1 1 0 1 0 0 1
1 1 1
6 5 
X  +x +x
4 2
X 1 1 1 0 1 0 0
r ! \ 4 3 2Glx)=x +x +x +1 original information bits
FIGURE 4.19 GENERATION OF A (7,3) SYSTEMATIC CYCLIC CODE
= ) D —
CLEAR
D o D 0 D 0 D 0
CLR CLR CLR CLR










d= Hamming distance 
e= no. of errors detected 
f= no. of errors corrected










improperly corrected double error from X 
(looks like a single fault from Y)
FIGURE 4.22 FRAGMENT OF A DISTANCE 3 ERROR CORRECTING CODE





Scan Enable O  
Scan Data In O' 
(SDI)
System Clock O' 
(CK)



















































L2 L3 L4 SCAN
DATA OUT














































































B1 _  
B2 ^  
SDI ,
CLK
p f C K
■i
(a) COMPLETE STRUCTURE
p f  CK





















"t>°^ D 0 
•* CK
”t>°— D ' 0
CK




(c) B1 = B2 - O




(d) B1 = l, B2 = O
CLK
CK ■CKCK CK
(e) B1 = O. B2 = 1











FIGURE 4.29 TESTING WITH BILBOS
PRIMARY INPUTS COMBINATIONAL
LOGIC






>  SCAN DATA OUT





























N S MODE - M
1 X NORMAL OPERATION
0 0 LFSR DATA GENERATOR
0 1 LFSR DATA COMPRESSOR




G «=*> G1 G2

















FIGURE 4.33 PARTITIONING WITH MULTIPLEXERS
(a) SUBCIRCUIT 
CREATION



































































FIGURE 4.35 STRUCTURE OF THE SERIAL SHADOW REGISTER
INPUTS | OUTPUTS OPERATION
MODE SDI CLK DCLK||B0-B7 S0-S7 SDO
0 X t A Dn*Bn HOLD S7 LOAD OUTPUT REGISTER FROM INPUT BUS
0 X A r HOLD
Sn-l+Sn
SDI+SO
S7 SHIFT SHADOW REGISTER DATA
0 X t t Dn-*Bn Sn-l+SnSDI-»SO
S7 LOAD OUTPUT REGISTER FROM INPUT BUS 
WHILST SHIFTING SHADOW REGISTER DATA
1 X t A Sn+Bn HOLD SDI
LOAD OUTPUT REGISTER FROM SHADOW 
REGISTER
1 0 A t HOLD Bn-*Sn SDI LOAD SHADOW REGISTER FROM OUTPUT BUS
1 0 t t Sn*Bn Bn-»Sn SDI SWAP SHADOW REGISTER AND OUTPUT REGISTER
1 1 A t |hold HOLD SDI NO OPERATION : D0-D7 ARE OUTPUTS
t * rising edge of clock * * clock steady or falling






















































-  - -
PRIMARY INPUTS ISG
SUPERVISOR
FIGURE 4.38 SELF VERIFICATION
CHAPTER FIVE : SELF CHECKING CIRCUITS - AN INTRODUCTION
5.1 : INTRODUCTION
A circuit which has its output encoded in an error
detecting code is termed a self checking circuit C5.1,
5.23, as shown in Fig 5.1. During failure free operation
its output is always a code word (see section 4.3). The 
output of a self checking circuit is monitored by a 
checker which indicates the presence of non codewords, as 
shown in Fig. 5.2. A self checking circuit has the 
properties of 'self test' and 'fault security'. These are 
defined by Anderson C5.23 as follows.
DEFINITION 5.1 : A circuit is self testing if, for every 
fault from a defined set, it produces a non codeword
output for at least one codeword input.
DEFINITION 5.2 : A circuit is fault secure if, for every 
fault from a defined set, it never produces an incorrect 
codeword output for codeword inputs.
DEFINITION 5.3 : A circuit is totally self checking if it 
is both self testing and fault secure.
A circuit could be self testing and fault secure for 
different input sets and different fault sets, so the 
following is proposed. A circuit is self testing for an 
input set of codewords X and fault secure for an input set
of codewords N, where N is a subset of X. If N equals X,
then the circuit is totally self checking as defined 
above. If N is a null set then the circuit is self
testing only and not fault secure at all. If N is a non 
null subset of X then the circuit is partially self
checking.
The self testing and fault secure properties of sequential 
circuits can be similarly defined as follows.
s
-113-
DEFINITION 5.4 : A sequential circuit is self testing if, 
for every fault from a defined set, it produces a non
codeword output for at least one codeword input.
DEFINITION 5.5 : A sequential circuit is fault secure if, 
for every fault from a defined set, it never produces an 
incorrect codeword output for codeword inputs.
Only combinational circuits are considered during the 
first part of the Chapter, with formal definitions of 
totally self checking, partially self checking and self
testing only circuits. Some examples of totally self
checking circuits are described in section 5.3. In 
addition, sufficient conditions are given for a network to 
be totally self checking. Totally self checking checkers 
are essential for checking codes, so section 5.4 details 
their general characteristics and design requirements, 
together with checkers specifically for separable codes. 
Section 5.5 presents the partially self checking property 
and its use for networks performing logical operations. 
Section 5.6 discusses self testing only circuits with 
sufficient conditions for self testing only networks. 
Section 5.7 analyses the self testing and fault secure 
properties for bit and byte-sliced circuits, whilst 
section 5.8 describes self checking sequential circuits.
Parallel processing of information (encoding and decoding) 
is assumed throughout, since this is the most appropriate 
method for computers.
5.2 : DEFINITION OF TERMS
Combinational circuits are considered to produce a fault 
free output vector Z(X^), where X^ is an input vector, see 
Fig. 5.3a. With a fault f in the circuit this becomes 
Z (X^), as shown in Fig. 5.3b. Faults will be one or more 
lines stuck at a logic value of 1 or 0 (SA1 or SAO); 
i.e. single and multiple faults. Unidirectional multiple 
faults have all affected lines stuck at the same value.
-114-
A circuit whose output Z(X^) is encoded in an error 
detecting code S is referred to as a self checking circuit 
115.1,5.21, see Fig. 5.3c. During fault free operation, 
the output of such a circuit is always a codeword S^, 
where:
Si = Z(Xi) and S ± € S (5.1)
The set S is termed the output code space or output code 
set. A checker monitors the output of the circuit ,
detecting and indicating the presence of non codewords. A 
fault can change the output into a non codeword (detecta­
ble), or an incorrect codeword (undetectable). The fault 
secure property of self checking circuits limits the 
occurrence of undetectable errors.
DEFINITION 5.6 : A circuit is fault secure for an input
set N and a fault set Fg if, V f € Fg and V N^ 6 N, then 
Zf(Ni) = Zd^) € S or Zf(Ni) S.
Thus the output of a fault secure circuit is always 
correct, or is a non codeword, as indicated in Fig. 5.4. 
The input set N is termed the secure input set (not 
necessarily all code inputs) and Fg the secure fault set. 
Fault security ensures that a fault will not cause an 
undetectable error, but does not indicate that it will 
eventually produce a detectable error; this is covered by 
the self testing property for self checking circuits.
DEFINITION 5.7 : A circuit is self testing for an input 
set X and a fault set Ffc if, V f 6 F^ ., 3 X. 6 X such 
that Zf(X.) 4 S.
Fig. 5.5 illustrates this definition. The input X. for
f 1
which Z (Xi) is not in S is termed a test for f and Ffc 
is the tested fault set.
X is termed the normal input set and it is assumed that 
*
-115-
all X^ occur during normal operation. Then the self 
testing property guarantees that all faults in F^ _ will 
produce detectable errors.
U is the set of all possible n-bit vectors given by:
U  = t U t =  < 0 ^ 2 ______ “ n H ' - ' j  €  t O f 1 3 , l C j < n , H i v < 2 n J ( 5 . 2 )
which is abbreviated to:
U = £0,13n : the set of all n-bit combinations
of 1 and 0 (5.3)
Thus:
X C  U and S C  U (5.4)
The secure input set N is assumed to be a subset of X,
N C  X. Fault security specifies circuit behaviour for
codeword inputs only. Although a circuit may be fault
secure for inputs outside N, these are not of interest
since they do not occur during normal operation. F iss
also assumed to be a subset of F. , F C  F. .t s — t
The properties of self testing and fault security are also 
illustrated in Fig. 5.6. This depicts the set of all 
faults F with its subsets Fg and Ffc, the set of all input 
vectors U with subsets X and N and the set of all output 
vectors with subset S. Self test is demonstrated by the 
existence of a test in X for the faults fl, f2 and f3 in F^ 
(X^ or X2, Xg and X^ respectively). Fault security is 
illustrated by the behaviour of various faults on Z(X^). 
Erroneous codewords may be produced from faults outside 
Fg (eg. Z^(X2)) or inputs outside N (eg. Z^tX^)).
Fault security specifies the behaviour of a circuit for
codeword inputs only. It is also important to know how
circuits perform with non codeword inputs. The code
disjoint property covers this aspect of operation.
*
-116-
DEFINITION 5.8 : A circuit is code disjoint for an input 
code S' and an output code S if, V X^  ^S' then Z(X^) ^ S.
The fault free output function of a circuit which is code 
disjoint maps codeword inputs Xj, € S' to codeword outputs 
S^ 6 S and maps non codeword inputs X^ S' to non 
codeword outputs S^  ^S. Fig. 5.7 illustrates this 
property.
Given X, F^ is determined to be the largest fault set for 
which the circuit is self testing. Fg is chosen on the 
basis of faults that are likely to occur (eg. all single 
stuck-at faults) and then N is determined to be the largest 
input set for which the circuit is fault secure. N is 
therefore controlled by the choice of F , but depending 
on whether N contains all, some or none of X, then the 
circuit is defined as totally self checking, partially 
self checking, or self testing only. These are defined as 
follows.
DEFINITION 5.9 : A totally self checking (TSC) circuit is 
self testing for fault set F^ _ with normal input set X and 
fault secure for fault set Fg with input set X.
In a TSC circuit, all faults in Ffc are tested and no fault 
in Fg causes an undetectable error during normal operation 
However, there are circuits which are fault secure for 
only a subset of normal inputs.
DEFINITION 5.10 : A partially self checking (PSC) circuit
is self testing for fault set F^ with normal input set X,
but fault secure for fault set F„ with a non null subset Ns
of X.
In a PSC circuit, all faults in F^_ are testing during 
normal operation. Faults in Fg will not cause undetect­
able errors when inputs are from N, but can when inputs 
are from the set X-N.
-117-
Finally, a particular choice of Fs might result in there 
being no inputs for which the circuit is fault secure.
DEFINITION 5.11 : A self testing only (STO) circuit is
self testing for fault set F^ with normal input set X, but
fault secure for fault set F with a null subset N of X.s
In a STO circuit, all faults in F^ are tested during 
normal operation, but undetectable errors can occur at any 
time, as shown in Fig. 5.8.
The operation of these three forms of a self checking 
circuit can be summarised as follows:
1) TSC circuits and PSC circuits operating in their
secure mode (inputs from N) always output correct
codewords for any fault in F .s
2) STO circuits . and PSC circuits operating in their
insecure mode (inputs from X-N) can output incorrect 
codewords.
3) All self checking circuits have the self testing 
property, which guarantees that any fault from F^ will 
eventually be detected (eventually since the occur­
rence of input Xj, from X, which produces a detectable 
error for fault f, cannot be specified).
The likelihood of an undetectable error during the in­
secure mode of a PSC circuit is dependent on its usage in 
that mode. If the insecure mode is used infrequently, it 
is likely that a fault will be detected during its secure 
mode before any erroneous result is produced during its 
insecure mode. Under these conditions, even undetectable 
errors occurring during its insecure mode will soon be 
detected when it reverts to the secure mode.
There is always a chance that transient errors occurring 
during an insecure mode will never be detected, so, for 
critical applications, the use of PSC or STO circuits may
-118-
be ruled out. In non critical applications an immediate 
detection of errors is not essential, only an eventual 
detection is required. Here low cost STO and PSC circuits 
can be employed instead of the more expensive TSC 
circuits.
Another point worth considering is the probability that a 
fault will be detected within t units of time from its 
occurrence. The expected value of t has been referred to 
as the 'error latency' of a circuit C5.3D. Hardware 
detection of single failures is only vaild if the error 
latency of the circuit is much less than the expected time 
for a second fault to occur. The circuit might otherwise 
experience a multiple fault for which it is insecure.
5.3 : TOTALLY SELF CHECKING CIRCUITS AND NETWORKS
In totally self checking (TSC) circuits, the set of secure 
inputs N is the same as the normal input set X, so only 
the latter needs to be mentioned. A number of simple TSC 
circuits are presented, which either process their inputs 
without modification, or transform them into other code­
words. These and other TSC circuits may be interconnected 
to form a network, but this network will, in general, not 
be TSC. Sufficient requirements are therefore given to 
ensure that it is.
5.3.1 : TSC Circuits
Example 5.1 : The k-bit parity buffer shown in Fig. 5.9, 
which consists of k identical buffer gates, is TSC. The 
output code space S equals the normal input set X and is 
the set of all odd parity k-bit vectors. A single stuck-
at fault will either have no effect on the output, or
change it from an odd parity codeword to a non codeword, 
since a single bit change will always create an even
parity codeword. A multiple fault which causes an even
number of bits to change at the output is not detectable 
since it creates an incorrect codeword. This situation
-119-
occurs for at least one code input (in the presence of all 
multiple faults), so the circuit is fault secure for F , 
the set of single stuck-at faults. However, there is at 
least one code input which will produce a non code output 
(change an odd number of bits) for all multiple faults, 
except where all gates are affected (e.g. the output stuck 
at a codeword). The circuit is therefore self testing for 
Fj., the set of all faults affecting fewer than all k-bits.
Fig. 5.10 illustrates these points for various faults.
The circuit is thus TSC.
Example 5.2 : The multiplexer for two k-bit parity encoded 
words shown in Fig. 5.11 is also TSC. When control lines 
<s9s,> = <01> input vector A. becomes output vector Y., 
whilst <s2s^> = <10> tranfers input vector to the
output. If S is the set of all odd parity k-bit vectors, 
the input set X is:
X = i <s2s1AiBi|<s2s1>=<01> and A ± e  S
or <s2s^>=<10> and B^ € S3 (5.5)
When the multiplexer is set to transfer A^ to any
fault affecting the data path of each individual bit is an 
identical problem to the set of buffers in example 5.1. A 
fault in the control path of s^ (eg. AND gate input SAO) 
for a particular bit will affect the data flow at the 
output of its input AND gate (Fig. 5.11) and is therefore 
equivalent to a fault in the data path. This hypothesis 
is also valid for a transfer of B.. to Y^ in conjunction 
with control line s2, so the multiplexer is TSC for the 
same fault sets given in example 5.1. If s2 and s^ fan­
out from a common source, then faults before the fan-out 
point are not detectable, in general, at the multiplexer 
output, since they could represent another valid control 
input, as shown in Fig. 5.12. These faults will have to 
be detected by some other means such as, for example, 
encoded control lines.
Examples 5.1 and 5.2 transfer an input vector to their
-120-
outputs without modification, so S could be any error 
detecting- code and the circuits would still be TSC, 
although not neccessarily for the same faults sets. 
Consider now an example which combines two input codewords 
to produce a different output codeword. It performs a 
function Z(X.,Y^) such that X. and € S implies
Z(Xi,Yi> € S, i.e. function Z preserves the encoding.
lr
Example 5.3 : A modulo 2 -1 ripple adder for data encoded 
in an AN code is described in section 4.3.5 and illustra­
ted in Fig 5.13a. Assuming that the carry and sum bits of 
each full adder are computed with independent subcircuits, 
as shown in Fig. 5.13b, then a single fault in the circuit 
will have one of the following effects:
1) No effect.
2) Carry out is 1 when it should be 0, or vice versa.
3) Sum out is 1 when it should be 0, or vice versa.
Effects 2) and 3) will produce a result which is incorrect 
by a power of two. Clearly, if the multiplying factor of 
the code , A, is a power of two, then the output becomes 
an incorrect codeword. However, if low cost AN codes are 
used, where A = 2^-1|b an integer > 1 (see section 4.3.5), 
this problem does not arise and the adder is fault secure 
for all single stuck-at faults. Fig. 5.14 shows the 
effect of these faults for A=2 and A=3. There is at least 
one code input combination which will produce a non code 
output for faults affecting fewer than all k-bits, so 
under these conditions the circuit is self testing and 
hence TSC.
5.3.2 : TSC Networks
The difficulties of providing general design rules for TSC
circuits (see next Chapter) means that large TSC networks
are often formed from the interconnection of a number of
smaller TSC logic blocks. However, these blocks may be
TSC in isolation, but when interconnected to form a large 
✓
-121-
network, they could be neither fault secure or self 
testing. A block may not be fault secure because a non 
code output may not propagate to the network outputs. In 
addition, the block may not be self testing because its 
inputs are outputs from other blocks. As a result, the 
required patterns to ensure self test might not exist 
during normal operation.
Anderson C5.23 gives a theorem for TSC networks with 
sufficient although not necessary conditions. Before 
presenting this theorem, two further definitions are 
required.
DEFINITION 5.12 : A block of a network is defined to be
fully exercised if it receives its entire input code space
B from the application of normal input set X to the
network.
This definition is illustrated in Fig. 5.15.
DEFINITION 5.13 : A block of a network is defined to be
securely located if, on the basis of its inputs, the
preceding subnetwork of the total network is fault secure,
i.e. its input encoding will detect all f 6 Fsns in the
previous subnetwork, where F,  is the union of F_ forsns s
each block in this subnetwork.
This definition is illustrated in Fig. 5.16.
THEOREM 5.1 : An interconnection of logic blocks is TSC
for the network secure fault set F (F is the union ofns ns
Fg for each block) and the network self testing fault set
F . (F . is the union of F. for each block), if each blocknt nt. t
is:






Proof: The network must he shown to he fault secure and 
self testing.
Fault Security: Since each hlock is fault secure for its 
F , a faulty block generates either the correct codeword 
or a non codeword at its outputs. For a non code output, 
succeeding blocks will not receive an incorrect code input 
as they are securely located for ^sns* They, in turn, 
will produce non code outputs because they are code 
disjoint. Overall, a non code output from a faulty block 
propagates to the final output of the network as a non 
codeword. Hence the network is fault secure.
Self test: Since each block is self testing for its F^ and 
fully exercised, a faulty block will always produce a non 
code output for some network input. By the same argument 
given above, the non code output always propagates to the 
output of the network as a non codeword. Hence the 
network is self testing.
The most difficult condition to satisfy is the secure 
location of each block. Anderson C5.23 give two coroll­
aries to achieve this for specific fault classes. These 
are given here without proof.
Corollary 5.1 : A network consisting of an interconnec­
tion of blocks is TSC for single faults (F^= F .) if the3 ns nt
network has no reconvergent fan-out of block outputs and 
each block is i) TSC for single faults, ii) code disjoint 
and iii) fully exercised.
Corollary 5.2 : A network consisting of an interconnec­
tion of blocks is TSC for unidirectional faults if the 
network contains no inverters and each block is i) TSC for 
unidirectional faults, ii) code disjoint, iii) fully 
exercised and its inputs are encoded in a unidirectional 
error detecting code.
-123-
5.4 : TOTALLY SELF CHECKING CHECKERS
5.4.1 : General Structure
The output of a self checking circuit is monitored by a 
checker which indicates the presence of non codewords. 
The output from the checker could be a 0 for codewords and 
a 1 for non codewords, as indicated in Fig. 5.2. However, 
the checker needs to be totally self checking so that 
faults within it can be detected just as easily as those 
in the circuit it is monitoring. It must not allow a 
detectable error (non codeword) in the circuit it is
monitoring to be ignored because of a failure within 
itself. Its output must therefore be encoded in an error 
detecting code Sc. The encoding suggested above is
clearly not an error detecting code, as a single fault, 
the output SAO (no error), would never be detected.
Suitable error detecting codes have been mentioned in 
section 4.3.3. These are the duplication code, S = t<00>, 
<11>3, and the l-out-of-2 code, = €<01>,<10>1. The 1- 
out-of-2 code is preferred since it detects unidirectional 
multiple errors (see section 3.1.4), such as those 
produced by the loss of power, and this is the code which 
has been generally adopted.
Lines constantly at one logic level are thus not allowed 
in a network which is to be self testing, since the lines 
would not be tested for being stuck at that value. This 
must also be true of the checker output, even if it is 
encoded. If a failure (or failures) within the checker 
caused its output to be permanently <10>, for example, 
then this is just as undetectable as the single line SAO 
above. Therefore, the checker output must alternate 
between <01> and <10> during normal operation to indicate 
an error free network. This means that there must be at 
least one code input to the checker which is mapped to 
each of these outputs.
-124-
The system where 0 is represented as <01> or <10> and 1 
represented as <11> or <00> (or vice versa) is also known 
as Morphic logic, where Morphic Boolean functions replace 
the standard functions of OR, AND and NOT C5.4,5.5D.
Anderson C5.23 presented the model given in Fig. 5.17 for 
a totally self checking network, which consists of a 
functional circuit and a checker which are both TSC. The 
functional circuit has a normal input set X^ . and output 
set S^, whilst the checker has a normal input set Xc and 
output set Sc« In addition:
The fault free output function of the checker must map
code inputs to code outputs and non code inputs to non 
code outputs, so it must be code disjoint.
It is interesting to note that the checker does not
necessarily need to be fault secure. It must certainly 
be self testing and code disjoint, but if a fault produces 
an incorrect output codeword this is not a problem, since 
as an error indicator, the output of the checker is either 
a codeword (fault free) or non codeword (error). The 
actual codeword is not important.
Applying Theorem 5.1 to Fig. 5.17 demonstrates that the
network is TSC. Its secure and tested fault sets are
the unions of corresponding fault sets for functional 
circuit and checker. The normal input set of the network 
is X^ . and the output code set is defined as;






Sn = Sc x £0,l}m (5.10)
This definition allows errors to be detected from an
observation of the checker output only. If the output of 
✓
-125-
the network was specified as Sc alone, then network 
fault security would not guarantee that a code space 
output from the checker implied a correct functional 
circuit, see example 5.6. During fault free operation the 
output of the network is Sc x S^ ..
Chapter 6 describes the design of several TSC checkers.
5.4.2 : TSC Checker for Separable Codes
If the error detecting code is separable, then a checker 
for that code will consist of an equality checker and a 
check bit generator, as shown in Fig. 5.18. The equality 
checker compares the check bits of the codeword with new 
check bits generated from the data part of the codeword. 
The following theorem demonstrates that a checker is TSC 
if the equality checker is TSC:
THEOREM 5.2 : For the conditions:
i) A separable error detecting* code S with codewords 
<CifX^>, where X.^ € X is the original data and Ci € C 
the check bits generated from C^ = F(X^). 
ii) The checking network consists of a check bit generator 
which computes F(X^) and an equality checker which 
compares C^ and F(X^) (see Fig. 5.18).
The network is a TSC checker for S if the equality checker 
is TSC for the duplication code S', where:
S' = £<C.,C. > I 3 X. 6 X such that FtX^ = C^J.
The proof for this theorem is given by Wakerly in C5.6U. 
The equality checker has to be TSC for S' to satisfy the 
above theorem. It has two sets of C^ as inputs, which 
form S'. In most cases C^ will take on all possible 
combinations, C^ € C and C C u, so the equality checker 
will automatically be TSC. However, if the check bits do 
not take on all possible values, as for the residue codes 
illustrated in Fig. 5.19, then the self testing property
-126-
must be verified for those that do. The theorem also 
depends on the data part of S taking on all possible 
values to completely test the check bit generator for 
faults which affect its operation. Again, if not all 
these occur, then the structure of the check bit generator 
must be examined to verify its self testing property for 
those which do.
Overall the problem of designing TSC checkers for separa­
ble codes can be reduced to the much simpler problem of 
designing check bit generators for these codes (see 
examples in section 4.3), with no consideration of their 
self checking properties.
5.5 : PARTIALLY SELF CHECKING NETWORKS
If a logical operation is performed on two codewords the 
result is not, in general, a codeword; i.e. the code is 
not preserved. If the encoding is separable (separate 
data and check bits), then a logical operation performed 
on the two data parts will he correct, but the same 
operation performed on the two sets of check bits will 
not, in general. Duplication, however, is one code that 
is preserved by logical operations, but of course has a 
high redundancy.
A solution is to calculate new check bits from the data 
output after the logical operation. This has previously 
been referred to in sections 4.3.1 and 4.3.4 as check bit 
prediction. This is a practical scheme if it can be 
implemented in a self checking manner at low cost. 
Partially self checking (PSC) networks fulfill this 
requirement. They achieve less redundancy by not re­
quiring the immediate detection of errors.
Wakerly C5.6,5.73 has proposed three forms of PSC network, 
which are detailed below.
5.5.1 : Type 1 Networks
-127-
The simplest PSC network is referred to as the type 1 
model and shown in Fig. 5.20. It uses a TSC functional 
circuit with a fault free output function which maps the 
normal input set, X^ . C £0,13n to an output code space,
S^. C £0,13m . The TSC checker has normal input set Xc = S^
and output code space Sc = {<01>,<10>3.
There are also two control gates which enable or disable 
the error indication from the network. With control lines 
<s2si> set to <01>, the network error indicator mimics the 
the checker outputs, but when the lines are set to <10>,
they force the error indicator to <10> (no error).
The output code space for a type 1 PSC network is Sc x S^ 
as with a TSC network. The normal input set of the 
network is vectors of the form <S2S^Xi>, where Xi is the 
functional circuit input. When functional circuit inputs 
from X^ . are expected, <323 )^ is set to <01> and the 
network is logically equivalent to the TSC network of 
Fig. 5.17. However, when inputs* are not from X^ ., the 
functional circuit output may not be a codeword, so 
*S2S1^ se^ which disables the checker. When
used in this manner the network of Fig. 5.20 is PSC.
Let F be the set of all single stuck-at faults on the a
control gates. The secure and tested fault sets of the 
network are the unions of Fa with the corresponding fault 
sets of functional circuit and checker. The secure input 
set of the network, Nn , is:
N = €<s0s1X.>|(<sos1>=<01>) and (X. 6 X f)3 (5.11)n 2 1 l > 2 1 i f
The network may also be operated with insecure inputs from 
the set N ':
N ' = £ < s0s,X.>I(<s0s,>=<10>) and (X.| Xf )3 (5.12)n 2 1 i > 2 1 i * f
Normally during its insecure mode of operation the func­
-128-
tional circuit will only receive inputs outside X^ ., but, 
in practice, any input could occur, so N ' is better 
defined as:
N ' = {<s,s,X.>|(<sos,>=<10>)3 (5.13)n z 1 1 1 z 1
A type 1 PSC network alternates between secure and 
insecure modes of op 
the network is thus:
eration. The normal input set Xn of
X = N U N ' (5.14)n n n
THEOREM 5.3 : The network shown in Fig. 5.20 and described 
above is PSC.
Proof : During its secure mode of operation, the network
is self testing and fault secure for the fault sets of the
functional circuit and checker (both TSC networks).
Therefore, the network is also self testing with inputs
from X since X D  N . This leaves the evaluation of self n n n
test and fault security for Fa . -
Self test : All faults except d SAO and a SA1 are tested 
by some input from N^. The faults d SAO and a SA1 are 
not detected since d=0 and a=l during the secure mode 
(constant lines), but they will be detected by some input 
from Nn ', since each changes the correct error indicator 
output from <10> to <11> or <00>. Thus all faults in F^ 
are tested by some input from Xn «
Fault security : A single fault from F causes, at most,
 1 1   u
a single bit change in the error indicator output, 
producing either the correct output or a non codeword.
Example 5.4 : A type 1 PSC network is the n-bit parity 
checked bus buffer shown in Fig. 5.21. The TSC functional 
circuit is the n-bit bus buffer of Fig 5.9. The TSC 
parity checker is based on a design to be described in 
Chapter 6. The control inputs <S2S^> are set to <01> for
-129-
the transmission of odd parity words and to <10> for words 
of unknown parity.
The application of type 1 networks is limited, since a re­
encoded functional output is not available during their 
insecure mode. However, in the example of Fig. 5.21, re­
encoded parity is always available from the parity checker 
at line P. Type 2 networks implement this concept.
5.5.2 : Type 2 Networks
A type 2 network is a type 1 PSC network which uses the 
TSC checker design described in section 5.4.2 and whose 
functional circuit output is a separable code. Fig. 5.22 
shows such a network.
A re-encoded functional output is provided by the check 
bit generator within the checker. The input sets, fault 
sets and output code space are the same as those for a 
corresponding type 1 network. Thus, ignoring the re­
encoded functional output, type * 2 networks are type 1 
networks with a specific TSC checker and hence they are 
PSC.
A non codeword output caused by a fault in the re-encoding 
operation is detected by the equality checker. These 
results are summarised in the following theorem.
THEOREM 5.4 : A type 2 network described above and
illustrated in Fig. 5.22 is PSC. In the absence of faults 
the re-encoded functional circuit output is always a 
codeword.
5.5.3 : Type 3 Networks
A disadvanatge of type 2 networks is that the functional 
circuit output is delayed by the check bit generator in 
the re-encoding process. In a TSC network or a type 1 PSC 
network, the functional output delay is due to the
-130-
functional circuit alone, whereas in a type 2 PSC network 
it is the sum of functional circuit and check hit 
generator delays. During its insecure mode, the re­
encoding process in any PSC network will always introduce 
delay, hut a type 3 network reduces this delay to that of 
a multiplexer during the secure mode.
Fig. 5.23 shows a type 3 network, consisting of a TSC 
functional circuit, a TSC equality checker and multiplexer 
which selects either the check hits of the functional 
circuit, or the output of the check generator as the 
network check bits.
The equality checker compares the network check hits with 
the generated check hits. When <S2S^> is <01> (secure 
mode) the network is logically equivalent to a TSC 
network. When <S2S^> is <10> (insecure mode) the func­
tional circuit output is re-encoded and the equality 
checker compares the generated check bits with themselves, 
producing a good output.
The normal input set, secure input set and output code
space of a type 3 network are the same as those of a type
1 or 2 PSC network. Let F be the set of all singlea
faults affecting the control gates except u^ SA1 (see Fig. 
5.23). The secure and tested fault sets of the network 
are the unions of F with the corresponding fault sets of 
the functional circuit and checker.
THEOREM 5.5 : A type 3 network described above and
illustrated in Fig 5.23 is PSC.
The proof for this theorem is similar to that for theorem
5.3 and requires a demonstration of the self testing and
fault secure capabilities of the network for F .a
Although type 3 networks avoid the delay associated with
a re-encoded functional output during their insecure mode,
they have two disadvanatges:
✓
-131-
1) They require more gates than a type 2 network with a 
corresponding increase in cost.
2) They have a set of single stuck-at faults for which 
the network is generally not self testing or fault
secure (u^ . SAD.
Wakerly E5.63, however, shows how to overcome the latter 
problem.
5.5.4 : A Partially Self Checking Logic Unit
Example 5.5 : The bit-slice (see section 5.7) of Fig. 
5.24a can perform sixteen different Boolean functions on 
two input variables a^ and b^, dependent on the setting 
of control inputs s^, s^, and s^. The functions
available are detailed in Fig. 5.24b. The circuit may be 
duplicated to form a bit-sliced functional circuit which 
can perform any of these operations on two input vectors 
A.. and such that for each value of r = <S£SgS2S^> the 
circuit performs a function Fr(A^,B^). If A^ and B^ 
are encoded in an error detecting code S with a Hamming 
distance of two, and if a function preserves that en­
coding (Aj,,Bi £ S and F ^ f A ^ B ^  £ S), then the functional
circuit is fault secure for all single bit-slice faults
from theorem 5.7. The secure input set N for F (A.,B.)r r i l
is:
N =£<s„s0s0StA.B.>I<s.s-s0St>=r and A.,B. £ S3 (5.15) r 4 3 2 1 1 1 *  4 3 2 1  l l
If the encoding is preserved by a number of r, say r £ R, 
then the secure input set N of the circuit is:
N = U N (5.16)
r £ R r
Consider S to be the set of even parity n-bit vectors, 
where n is even. This encoding is preserved by operations 
A± ©  Bi( ©  Bi, Ai( Bi, AT, IT, 0 and 1 (A., B. 6 S).
Fig. 5.25 details the single stuck-at faults in the bit-
-132-
lice (Fig 5.24a) which are detected by these operations 
C5.73. Inspection of Fig. 5.25 reveals that all single 
stuck-at faults are detected if both A^ ©  B^ and A^ ©  
are used. The same is true if all four operations A^, 
B^, A^ and B7 are used. If the functional circuit has an 
input set which contains any such set of code preserving 
operations then the circuit is self testing. The circuit 
is also fault secure for code preserving operations, so 
under these conditions it is TSC (secure mode). It can
therefore be used in a partially self checking network 
which re-encodes the output for those functions which are 
not code preserving.
The 74181 arithmetic logic unit/function generator C5.83 
contains four circuits similar to that in Fig. 5.24a. It 
can perform the logic functions of Fig. 5.24b or a set of 
arithmetic functions. Additional carry logic means that 
it is not a bit-sliced circuit, but if A^ and B^ are
encoded in an arithmetic error detecting code which has a 
Hamming distance of two, then their encoding is preserved 
by the addition and subtraction operations. These func­
tions will test the carry logic, and when combined with 
code preserving functions which test all stuck-at faults, 
will allow the 74181 to be TSC C5.73. This can then form 
the basis of a PSC arithmetic logic unit.
5.6 : SELF TESTING ONLY CIRCUITS AND NETWORKS
TSC and PSC circuits are self testing for a set of inputs
X and fault secure for a non-null subset N of X. N is a
null set for self testing only (STO) circuits.
Example 5.6 : Consider the 2 to 4 line decoder shown in 
Fig. 5.26. A 2-bit input vector Xj=<x2x^> is translated 
into a l-out-of-4 coded vector Y^=<y^y2y2y^>. The error 
indicator E^=<e2e^> is formed by ORing the outputs into 
two partitions:
ex = y4 + y1 (5.17)
-133-
(5.18)
In the absence of faults E. € £<01>,<10>1. The normal
1 2 input set is the set of 2-bit vectors 6 £0,1} , whilst
from (5.10) the output code space is the set of 10-bit
vectors Y^E..:
As usual, errors may be detected from observation of E^ 
alone.
Fig. 5.27 details the effect on and E^ of every single 
stuck-at fault in the circuit. Conclusions from this are:
1) The decoder alone is self testing and fault secure for 
all single stuck-at faults. It is also trivially code 
disjoint.
2) The checker alone is self testing and fault secure for 
all single stuck-at faults. It is not, however, code 
disjoint.
3) The overall network is self testing only for all 
single stuck-at faults. The secure fault set is null 
because a decoder output SA1 can produce a non code­
word which goes undetected. This occurs when y^ = 1
and y. is SA1, j 4 k, where y. is in the same output
K J
partition as y^, see Fig. 5.27.
If the output of the network was specified as E^ alone, 
then the fault in 3) would not be a problem, since the
checker output is the same with and without the fault.
However, using (5.19) means that this fault produces an 
incorrect codeword output from the network.
Y.E. 6 t0,lj4 X  t<01>,<10>} (5.19)
where during fault free operation is:
Tj. € cy4y3y2y1|yj € to,u, y4 + y3 + y2 + yx = u
(' +' is plus) (5.20)
-134-
The network is TSC though for all stuck-at-0 faults. This 
may be acceptable if the circuit is implemented with gates 
whose only failure is SAO. Note that faults in x  ^ and x, 
are not detectable since they create a different normal 
input. If it is neccessary to check these, then they need 
to be encoded C5.53.
Self testing only networks are formed by the interconnec­
tion of STO circuits, as shown in Fig. 5.28. The output of 
each STO block is either connected to the input of another 
block, is part of the network output, or both. Blocks 
which are part of the network output are termed output 
blocks (T^ and T^ in Fig. 5.28). The output code set of 
the network is the cross product of output block code 
spaces. The tested fault set of the network is the union 
of the fault set for each block. Sufficient conditions 
are given below for a network to be STO.
THEOREM 5.6 : A network which is an interconnection of 
logic blocks T^ is self testing only for a fault set 
Fj_= U Fj with normal input set X' if:
i) Each block is self testing for fault set F^ with 
normal input set X^ ., such that each input in X^ . occurs 
for some input in X. 
ii) Each block with inputs which are not network inputs or 
outputs is code disjoint.
Proof : It must be shown that there is a test for every
fault in F. = U F ..
t j j
For any j and any fault f in F^ there is a network input 
in X  such that the output of T^ is a non codeword from 
condition i). If the output of T^ is a network output, 
then the network output is also a non codeword and f 
tested. Otherwise the output of T^ . must feed the input of 
another block, T^ , which is code disjoint from condition 
ii) . Then the non code output from T^ produces a non code 
output from T^. This process continues until a non code
-135-
network output is produced. In the example of Fig. 5.28, 
only blocks T^, and need to be code disjoint.
5.7 : SELF CHECKING PROPERTIES OF BIT 
AND BYTE-SLICED CIRCUITS
A bit-sliced circuit is defined as a multi-output combin­
ational circuit, in which each output bit is computed by 
an independent subcircuit, called a bit-slice. A byte-
sliced circuit is similarly defined as a multi-output 
combinational circuit, in which each output byte is 
computed by an independent byte-slice. As microprocessor 
based systems are bus orientated, their buses are particu­
larly suited to bit or byte-slice processing. In order 
to provide a more general means of analysing the self 
checking properties, it is convenient to consider bit and 
byte-sliced circuits, because of their structure and their 
application.
5.7.1 : Bit-Sliced Circuits
The circuits of Fig. 5.9 and 5.11 are bit-sliced circuits, 
whereas the circuit of Fig. 5.13a is not. Remember that 
a circuit is self checking if it self testing for a fault 
set Fj. with input set X and fault secure for a fault set 
Fg with input set N.
Fault Security : The fault security of a bit-sliced
circuit is demonstrated by the following theorem.
THEOREM 5.7 : If S is an error detecting code with a
minimum Hamming distance of two, B a bit-sliced circuit
with fault free output function Z(N^), such that Z maps
the input set N to S, and Fg the set of all faults which
affects only a single bit-slice, then B is fault secure
for F with N. s
Proof: Any fault f in Fg affects only a single bit-slice 
and therefore only a single output. For an input , f
-136-
may or may not change that output bit. If it does not 
Z^(N^)=Z(N^) € S, but if it does, then the output is one 
bit different to the correct codeword and Z^(N^)  ^S.
Thus, a bit-sliced circuit is fault secure for all single 
bit-slice faults with all inputs that are mapped to
codeword outputs. The checker must be disabled for inputs 
outside N during normal operation. This is therefore a 
PSC network.
Self Test; The self testing property for all single faults 
depends on the bit-slice structure and normal input set X. 
However, the self testing capability of a bit-sliced 
circuit can be determined from the individual slices.
Consider a single bit-slice within a bit-sliced
circuit, as shown in Fig. 5.29. The input to is ,
so that its output is Z^CA^). Given X, a set of input 
vectors A^ to the slice B^ can be determined, see Fig. 
5.29.
Example 5.7 : The TSC multiplexer shown in Fig. 5.30 has 
an input vector <S2S^a^b^.> to each bit-slice. During
normal operation <S2S^> is set to <01> or <10>, so that
for every j, a^ . and b^ . should take on all four possible 
combinations. The set A^ for B^ is thus:
Aj = C<0100>,<0101),<0110),<0111),
<1000),<1001>,<1010>,<1011)1 (5.21)
Given A , then the tested fault set for the complete bit- 
sliced circuit can be determined.
THEOREM 5.8 : If the output of a bit-sliced circuit is
encoded in an error detecting code S, where S has a
minimum Hamming distance of two, and for each bit-slice
Bj there is a set of faults such that for an input
A^. in Aj, Z^(A^j)  ^ Z^(A^), then the bit-sliced circuit
is self testing for the fault set F. = U F. ..t j tj
-137-
Proof : For any fault f in F*..., there is a bit-slice
input A. . in A. due to circuit input X. in X such that
f ^ 3 3 i
Z.(A. .)  ^Z.(A. .). No other bit is affected by f, so the
3 1 3 3 ^ - 3 * ’
circuit output Z (X^) is different from Z(X^) € S by a
single bit and therefore  ^ S. The circuit is thus self
testing for any fault in F^ and self testing for any fault
in F, = U F. .. t . t]
The self testing ability of a bit-sliced circuit can be 
determined by considering each bit-slice separately. In 
many cases the bit slices are identical (for example 
the multiplexer of Fig. 5.30), as are the input sets A^
and fault sets A standard set of faults F^. together
with a standard set of inputs A from X can therefore be 
ascertained. F^ is determined by methods which identify
the set of faults detected for a particular test A^ in A.
Alternatively, F^ is deemed to be a known fault set and
every A^ in A examined to prove that every f in F^ is 
tested.
5.7.2 : Byte-Sliced Circuits
The results of theorems 5.7 and 5.8 for bit-sliced 
circuits can be extended to cover byte-sliced circuits 
with outputs encoded in a byte error detecting code. Such 
circuits are fault secure for all single byte faults. As 
with bit-slice circuits, their self testing capabilities 
are dependent on structure and normal input set.
Example 5.8 : Consider the byte-slice multiplexer for 8-
bit words shown in Fig. 5.31. The select input, s; deter­
mines whether A^ or B^ is transfered to the output. The 
transfer takes place when the enable input is at 0, 
otherwise an all 0 output occurs. If the input vectors are 
encoded in a byte error detecting code for 4-bit bytes 
such that the all zero vector (all outputs at 0) is a 
codeword, then the circuit is fault secure for all single
byte faults. This statement is independent of the struc- 
✓
-138-
of the quad 2 to 1 multiplexer. Examples of suitable byte 
error detecting codes are the b-adjacent code shown in 
Fig. 4.10, or the checksum code shown in Fig. 4.12. Self 
test is not independent of multiplexer structure so each 
individual byte has to be examined to determine the level 
of self test.
5.8 : SELF CHECKING SEQUENTIAL CIRCUITS
A model of a sequential machine is required to develop the 
definitions of self test and fault security given for 
sequential circuits at the beginning of the Chapter. The 
model depicted in Fig. 5.32 will be used which is 
specified by <V,Q,W, £,co> , where:
V is the set if input vectors.
Q is the set of states.
W is the set of output vectors.
S  is the next state function of the fault free machine,
S : V x Q Q'. 
o> is the output function of the fault free machine, 
to : Q —) W.
V contains all possible binary input vectors, W all 
possible output vectors and Q all possible states, regard­
less of whether they occur in normal operation. For fault 
detection purposes, the output of the circuit must be 
encoded in an error detection code, such that S C  W and 
non codewords detected. It is assumed that only the 
circuit output is observable; i.e. the state and input 
information is not observable, unless it is part of the 
output function.
5.8.1 : Fault Security
A sequential circuit fault may change the next state
function, the output function, or both. The next state
function of a machine in the presence of a fault is
denoted by $ ^ and the output function by o  ^ . Fault
security requires that output function failures must 
produce either the correct code output or a non code 
output. Next state failures must produce either the 
correct state or a non code output before an incorrect 
next state produces an incorrect code output. It is 
therefore possible for a sequential machine to be in 
several incorrect states before a failure is indicated,, 
as long as none of these incorrect states produces an 
incorrect code output. In practice, however, a transition 
to an incorrect state needs to be indicated immediately. 
This requires the set of states occupied by the fault free 
machine to be encoded in an error detecting code, P C  Q, 
and also requires the output function to be a code 
disjoint mapping from P to S.
DEFINITION 5.14 : A sequential circuit is fault secure for
a fault set Fs with input set N (N C V), if, V N^ 6 N,
V f 6 F and V P. € P, either : s 1
i) Sf (N.,P.) = S(N.,P.) and o>f (Sf (NT, PT ) ) = u> (S(Nt ,PT ) ) l i 1 1  1 1  1 1
or
ii) cuf (Sf(iq,Pj>) 4 S
This definition, illustrated in Fig. 5.33, requires that 
all faults either produce no effect on the next state and 
output function for a particular input, or produce an 
immediate error indication. Visiting several error states 
whilst maintaining the correct output before a detectable 
error is not allowed. (Diaz, however, in his definition 
could allow this behaviour C5.93.) The above definition 
also requires that a fault affecting both the next state 
and output functions must still give an error indication. 
When logic is shared between both functions this error 
indication is difficult to guarantee, so shared logic must 
be avoided. The definition of fault security does not 
require an error indication to be maintained; i.e. once a 
faulty circuit produces a non code state and indicates an 
error, the definition does not prevent it from returning 
to an incorrect code state. Many sequential machine
-140-
designs prevent this behaviour by 'error trapping', 
whereby the machine maintains a non code state if one is 
entered.
The general block diagram of Fig. 5.32 may be modified to 
give the fault secure machine in Fig. 5.34. The current 
state is held in a state register implemented with 
(say) D type flip-flops fed by a common clock signal. It 
is assumed that the clock signal is checked separately, 
see Chapter 6. The state register is updated with the 
output of a circuit which computes S(N^,P^) and the 
machine output is produced byo(P..). The state vector P^ 
should be from an error detecting code P so that the state 
register is fault secure. If, for example, P is a single 
bit parity code, then the state register is fault secure 
for all faults which affect a single bit. The next state 
circuit must also be fault secure. This is accomplished 
by using an independent subcircuit for each bit. Finally, 
the output circuit should be a fault secure, code disjoint 
mapping from state code P to output code S, so that P^€ P 
becomes € S.
An m-out-of-n code is equally suitable for encoding the 
states of a fault secure machine. This has the advantage 
of detecting undirectional errors.
Example 5.9 : Consider the state diagram, state table and 
logic diagram given in Fig. 5.35 for a synchronous 
sequential machine. Output y is set when three consecutive 
l's are received by input x and reset by the next 0 at x 
thereafter. The design has four states and requires two 
flip-flops. A fault secure version of the machine is 
achieved by encoding the states in a single parity code 
and encoding the output in a l-out-of-2 code, as shown in 
Fig. 5.36. The machine still has four states, but now 
requires three flip-flops because of the additional parity 
'bit' and a second output to support the l-out-of-2 
encoding. The machine is now fault secure for a single 
gate or flip-flop fault.
-141-
5.8.2 : Self Test
A sequential machine is self testing if every fault is 
tested during normal operation. If P C  Q is the set of 
states during normal operation and X C v  is the set of 
normal inputs, then one way to define self test would be 
the existence of a state in P and an input X^ in X 
for each fault f in Ffc, such that co^  ( (XT ,P7) ) ^ S L5.93. 
However, this definition is not adequate since the inputs 
and states of the machine may be correlated and there is 
no guarantee that a particular input/state pair occurs in 
practice. For example, input strings with only two or 
three consecutive l's will not visit the states marked 
with a in Fig. 5.36. Therefore, the set XP of normal 
input/state pairs has to be defined, where <X^,P^> is in 
XP if Xi occurs with state Pi during normal operation. 
This allows a more satisfactory definition of self test­
ing sequential circuits to be made C5.63.
DEFINITION 5.15 : A sequential circuit is self testing for 
a normal input/state set XP and fault set F^ _, if, 
V f 6 F. , 3 <X.,P. > G XP, such that i/ (gf (XT,P7) 4 S.
Z 1 1  1 1 *
This definition is illustrated in Fig. 5.37. The sequen­
tial machine in Fig. 5.34 is self testing if the next 
state function, state register and output function are all 
tested by normal inputs. Self testing is not as easy to 
achieve as fault security. The design of example 5.9 in 
Fig.5.36, for instance, is not self testing for SAO faults 
on the inputs of its 3-input AND gate, even if all input/ 
state pairs occur. Requirements for the self testing of 
seqential machines are discussed by Diaz C5.9,5.103 and 
Ozguner C5.113.
5.9 : LITERATURE REVIEW
Carter and Schneider laid the foundations for self testing 
circuit theory in their historic 1968 paper C5.13, with a
-142-
model for dynamically checked logic. They also presented 
designs for parity and l-out-of-2 code checkers, which 
have since been universally adopted as a basis for TSC 
checkers. Carter in conjunction with other IBM personnel 
subsequently expanded these ideas from the view point of 
fault tolerant computing C5.12-5.143 and Morphic Boolean 
functions C5.43. Carter summarised many of these concepts 
in his excellent 1974 paper C5.53. The most recent 
documentation from the team has been a fault tolerant 
memory system C5.153 and a self checking computer design 
C5.163. The latter is a significant development from 
their first paper.
In his thesis, Anderson C5.23 formalised the initial ideas 
of Carter, with definintions for TSC circuits and net­
works, plus a discussion on how to test them. He also
presented an important design technique for TSC m-out-of-n 
checkers, which was extended in a later paper C5.173. 
Many papers have been written on the design of TSC code 
checkers and these will be detailed in the next Chapter. 
Kolupaev C5.183 has more recently considered TSC networks, 
with modifications to the fault secure and self testing 
definitions proposed by Anderson.
Using a more mathematical approach, Smith C5.193, in his 
thesis,detailed TSC circuits for single and unidirectional 
faults, as well as considering their behaviour for un­
modelled faults. A subset of this work appeared in C5.203, 
with further papers discussing strongly fault secure logic 
networks C5.213 and TSC system design C5.223. Self 
checking systems are the subject of Chapter 7, prompted by 
the work of Moreira de Souza et al C5.233. Strongly fault 
secure systems are also considered by Nanya C5.243, whilst 
a design technique for TSC circuits with unrestricted 
stuck-at faults using redundancy in time and space has 
been proposed by Rao C5.253.
In 1974 Wakerly introduced the concept of PSC circuits and 
networks C5.73, which subsequently formed part of his book
-143-
C5.63. In the book he also used modified definitions of 
the fault secure and self testing properties. The relia­
bility of PSC circuits has been discussed by Gay C5.263 
and a signal reliability evaluation of self checking 
networks proposed by Hong C5.273
The majority of the papers referenced so far are concerned 
with combinational circuits only, but Diaz has presented a 
number of papers on the design of sequential TSC circuits 
C5.9,5.10,5.283, along with Smith C5.223, Nanya C5.293, 
Viaud C5.303 and Ozguner C5.113. The code disjoint 
property has been discussed by Jansch C5.313 for sequen­
tial circuits and by Nicolaidis C5.323 for combinational 
circuits.
The construction of TSC circuits using programmable logic 
arrays (PLAs) has been widely investigated C5.33,5.343, 
particularly for checkers (see Chapter 6), whilst the 
specific problems associated with TSC circuit implementa­
tion in technologies such as MOS C5.35,5.363, CMOS C5.373 
2
and I L C5.38,5.393 have also been examined.
Specific examples of self checking circuits include a self 
checking form of the linear feedback shift register 
(mentioned many times in Chapter 4), developed by Lu 
C5.403, a self checking periodic signal checker, proposed 
by Usas C5.413 and a self testing decoder described by 
Carter C5.423.
All of the above papers detail various aspects of self 
checking hardware, but self checking software also needs 
to be considered C5.433.
-144-
5.10 : REFERENCES
5.1) Design of dynamically checked computers. - W. C. 
Carter, P. R. Schneider; Proc. IFIPS Congress, 
Vol. 2; Edinburgh; August 1968; North-Holland 
Publishing Company, Amsterdam (1969); pp. 878-883.
5.2) Design of self-checking digital networks using 
coding techniques - D. A. Anderson; Coordinated 
Science Laboratory Report R-527; September, 1971; 
University of Illinois, Urbana, Illinois, USA.
5.3) The error latency of a fault in a combinational 
digital circuit - J. J. Shedletsky, E. J. McClus- 
key; FTCS-5*; pp. 210-214.
5.4) Computer error control by testable morphic Boolean 
functions - A way of removing hardcore - W. C. 
Carter, A. B. Wadia, D. C. Jessep; FTCS-2*; pp. 
154-159.
5.5) Theory and use of checking circuits - W. C. Carter; 
Infotech*; pp. 415-454
5.6) Error detecting codes, self checking circuits and
applications - J. F. Wakerly: Elsevier - North
Holland; New York; 1978; Chapter 3, pp. 54-99.
5.7) Partially self-checking circuits and their use in 
performing logical operations - J. F. Wakerly; IEEE 
Trans. Coraput.; Vol. C-23, No. 7; July, 1974; pp. 
658-666.
5.8) The TTL book for design engineers - The engineer­
ing staff of Texas Instruments components group; 
Texas Instruments; Fourth European edition; 1980; 
pp. 7:271-7:281.
5.9) Design of totally self-checking and fail-safe 
sequential machines - M. Diaz; FTCS-4*; pp. 19-24.
5.10) Unified design of self-checking and fail-safe 
combinational circuits and sequential machines - M. 
Diaz, P. Azema, J. M. Ayache; IEEE Trans. Comput.; 
Vol. C-28, No. 3; March, 1979; pp. 276-281.
5.11) Design of totally self-checking asynchronous and 
synchronous sequential machines - F. Ozguner; FTCS- 
7*; pp. 124-129.
5.12) Logic design for dynamic and interactive recovery 
- W. C. Carter, D. C. Jessep, A. B. Wadia, P. R. 
Schneider, W. G. Bouricius; IEEE Trans. Comput.; 
Vol. C-20, No. 11; November, 1971; pp. 1300-1305.
5.13) A theory of design of fault-tolerant computers 
using standby sparing - W. C. Carter, W. G. Bouri­
cius, D. C. Jessep, J. P. Roth, P. R. Schneider, A.
-145-
B. Wadia; FTCS-1A; pp. 83-86.
5.14) A theory of design of fault-tolerant computers 
using standby sparing - W. C. Carter, W. G. Bouri- 
cius, D. C. Jessep, J. P. Roth, P. R. Schneider, A. 
B. Wadia; IBM Research Yorktown; RC 3261; February, 
1971; pp. 1-11.
5.15) Implementation of an experimental fault-tolerant 
memory system - W. C. Carter, C. E. McCarthy; IEEE 
Trans. Comput.; Vol. C-26, No. 6; June, 1976; pp. 
557-568.
5.16) Cost effectiveness of self checking computer 
design - W. C. Carter, G. R. Putzola, A. B. Wadia, 
W. G. Bouricius, D. C. Jessep, E. P. Hsieh, C. J. 
Tan; FTCS-7*; pp. 117-123.
5.17) Design of totally self-checking check circuits for 
m-out-of-n codes; D. A. Anderson, G. Metz; IEEE 
Trans. Comput.; Vol. C-22, No. 3; March, 1973; pp. 
263-269.
5.18) Cascade structure in totally self-checking net­
works - S. Kolupaev; FTCS-7*; pp. 150-154.
5.19) The design of totally self-checking combinational 
circuits - J. E. Smith; Coordinated Science Lab- 
oratoryReport R-737; August, 1976; University of 
Illinois, Urbana, Illinois, USA.
5.20) The design of totally self-checking combinational 
circuits - J. E. Smith, G. Metz;FTCS-7*;pp. 130-4.
5.21) Strongly fault secure logic networks - J. E. Smith, 
G. Metz; IEEE Trans. Comput.; Vol. C-27, No. 6, 
June, 1978; pp. 491-499.
5.22) A theory of totally self checking system design 
- J. E. Smith, P. Lam; IEEE Trans. Comput.; Vol. 
C-32, No. 9; September, 1983; pp. 831-844.
5.23) A research oriented microcomputer with built in 
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
5.24) Error secure/propagating concept and its applica­
tion to the design of strongly fault secure pro­
cessors - T. Nanya, T. Kawamura; FTCS-15*; pp. 396- 
401.
5.25) Design of totally self-checking circuits with an 
unrestricted stuck-at fault-set using redundancy in 
space and time domains - K. V. S. S. P. Rao, D. 
Basu; IEEE Trans. Comput.; Vol. C-32, No. 5; May, 
1983; pp. 464-475.

















- F. A. Gay; FTCS-7*; pp. 135-142.
Signal reliability evaluation of self checking
networks - K. K. Hong, Y. Thoma; FTCS-10*; pp. 257- 
262.
Design of self-checking microprogram controls - M. 
Diaz, J. Moreira de Souza; FTCS-5*; pp. 137-142.
Design of self-checking asynchronous sequential 
machines - T. Nanya, Y. Tohma;FTCS-10*;pp. 278-280.
Sequentially self-checking circuits; J. Viaud, R. 
David; FTCS-10*; pp. 263-268.
Strongly language disjoint checkers - I. Jansch, 
B. Courtois; FTCS-15*; pp. 390-395.
Strongly code disjoint checkers - M. Nicolaidis, 
I. Jansch, B. Courtois; FTCS-14*; pp. 16-21.
Design and application of self-testing comparators 
implemented with MOS PLA's - Y. Tamir, C. H.
Sequin; IEEE Trans. Comput.; Vol. C-33, No. 6;
June, 1984; pp. 493-506.
The design of totally self-checking circuits using 
programmable logic arrays - S. L. Wang, A. Avizien- 
is; FTCS-9*; pp. 173-180.
Techniques for efficient MOS implementation of 
totally self-checking checkers - N. K. Jha, J. A. 
Abraham; FTCS-15*; pp. 430-435.
Analysis of a class of totally self-checking 
functions implemented in a MOS LSI general logic
structure - M. W. Sievers, A. Avizienis; FTCS-11*; 
pp. 256-261.
Totally self-checking CMOS circuits using a hybrid 
relalization - N. K. Jha, J. A. Abraham; FTCS-15*; 
pp. 154-158.
2
Multivalued I L circuits for TSC checkers - D. 
Etiemble; FTCS-9*; pp. 181-184.
2
Threshold I L totally self checking circuits - Y. 
W. Yang, S. M. Liu, T. C. Chen; FTCS-10*; pp. 284- 
287.
Self-checking linear feedback shift registers - D. 
J. Lu; FTCS-10*; pp. 269-271.
A totally self-checking checker design for the 
detection of errors in periodic signals - A. M. 
Usas; IEEE Trans. Comput.; Vol. C-24, No.5; May, 
1975; pp. 483-489.
-147-
5.42) A simple self-testing decoder checking circuit 
- W. C. Carter, K. A. Duke, D.C. Jessep; IEEE 
Trans. Comput.; November, 1971; pp. 1413-1415.
5.43) Self stabilising programs : The fault-tolerant
capability of self-checking programs - A. Mili; 
IEEE Trans. Comput.; Vol. C-31 No. 7; July, 1982; 
pp. 685-689.




(CODE IS  ERROR DETECTING)










> - ERROR SIGNAL
















OUTPUT VECTOR=Z ( X i )




VECTOR SA S i6S
(c) SELF CHECKING COMBINATIONAL CIRCUIT




'• z (Ni) e s » A 
•
any Nj 6 N» any f £ F§ 2 x ( N i )  = Z ( N i )  e  S
orZ
FIGURE 5.4 A FAULT SECURE CIRCUIT
any Xi 6 X , fault free
some Xi 6 X t 2f (Xi) $ s





Ft C  F











any Xi €» S '! 
(ie.S'i) I
2 ( X i  ) e S S ' i  6 s 
S'  c  ur
s c  u




(S and S' are both 
error detecting codes)
FIGURE 5.7 A CODE DISJOINT CIRCUIT
any Xi .6 X •
some Xi &  X
fault free
any f e Ft self test </
some Xi £  X any f 6 Ff . Zf(Xi ) = Z(Xj ) 6  s|j*i
incorrect 
codeword input
fault security X 
Fs=d


















n * 4, 1 parity bit k = 5 
R = result status:
a) * = correct codeword output.
b) E = non codeword output (even parity codeword)
c) X = incorrect codeword.
Correct outputs » inputs.
Input fault a corresponding output with same fault. 
a1,a2,a3,a ,a5 SA1 <y«;yAy,y,y1 > permanently 111115 M J3J2Jr





























Co ADDER Ci C° ADDER Ci co ADDER ci
(a)
s = a © b © c i cQ = ab + c ^ ( a @ b )










































































VXf6X, Vf6F, then B.is unaffected or i  B SnS x r






















FIGURE 5.17 A TOTALLY SELF CHECKING NETWORK
-155-





TOTALLY SELF CHECKING EQUALITY CHECKER
ERROR INDICATOR
FIGURE 5.18 A TOTALLY SELF CHECKER FOR SEPARABLE CODES
LOW COST CODE












































110 \ 4 c 
111)




































< S2 81 >  s, 
<01> ODD
PARITY

















TOTALLY SELF CHECKING 
EQUALITY CHECKER










NON CODEWORD OUTPUTS S I ►— — —
FROM FUNC. CIRCUIT





DATA BITS DATA BITSINPUTS
exr CHECK BITS 1
CHECK BITS
< S2 Sl>
<01>RE-ENCODED CHECK BITS 
<10/ORIGINAL CHECK BITS
CHECK BIT









TOTALLY SELF CHECKING EQUALITY CHECKER
OUTPUTS
_ es„











rs<s4s3s2si> w v Fr(Ai'V
0 0 0 0 10 0 0 + B1
0 0 0 1 Ai + B1 10 0 1 Aj ®  B1
0 0 10 ri • Bi 10 10 B1
0 0 11 0 10 11 Ai * Bi
0 10 0 A1 • Bi 1 1 0  0 1
0 10 1 h 1 1 0  1 Al + Bi
0 110 Aj ®  Bi 1 1 1 0 Ai + Bi
0 111 A1 • r i 1 1 1 1 Ai
FIGURE 5.24 A BIT-SLICE TO PERFORM 16 LOGIC FUNCTIONS
OUTPUTS
ON TWO VARIABLES 
-158-
FUNCTION NODES STUCK-AT-0
Fr(Ai V ai al bi b4 30 al s2 s3 P1 p2 P3 p4 zi
0 X X X X
1 X X X X X
A1 ©  B. X X X X X X X X X
Aj ©  B1 X X X X X X X X
Ai X X X X X X
X X X X
Bi X X X X X X X
^i X X
X X X X X X X
FUNCTION








81 a2 a3 ql *2 zi
0 X X X
1 X X X X
Aj © X X X X X X X X X X X
Aa ©  B1 X X X X X X X X X X
Ai X X X X
Bi X X X X X X X X X
Ai X X X X X X X
X X X X X X X X X
Notes : 1) X indicates a detected failure.
2) Only one member from each class of structurally 
equivalent faults is Included.
FIGURE 5.25 FAULT TESTED BY A SUBSET OF FUNCTIONS




A 2 TO 4 LINE DECODER WITH CHECKERFIGURE 5.26
-159-
KEY
NORMAL (FAULT FREE) OPERATION
xi ef gh U kl
Yi Ei
x2 X1 y4 7 3 y2 yl e2 C1
0 0 00 01 01 11 0 0 0 1 0 1
0 1 01 00 11 10 0 0 1 0 1 0
1 0 10 11 00 01 0 1 0 0 1 0
1 1 11 10 10 00 1 0 0 0 0 1
-- -- DECODER AND CHECKER OUTPUTS AS NORMAL
y4y 3y2y i e2el
ERROR IN DECODER OUTPUTS 
DETECTED BY CHECKER
--
e2el ERROR IN CHECKER OUTPUTS
y4y3y2yl —
ERROR IN DECODER OUTPUTS 
NOT DETECTED BY CHECKER
OPERATION WITH SINGLE STUCK-AT FAULTS



































- -  0000 00
1100 11































































"s a t SAO
- -  nil - -  oo
11 00
SA1






FIGURE 5.28 A SELF TESTING ONLY NETWORK
B IT  SLICED CIRCUIT
INPUT
x.exi







Aij e A j“*’
B IT SLICE Bj
• -
INPUT ,
A .. 6  A - *" 
ik k-».
•
B IT SLICE Bk
. OUTPUT =Z , (A111
OUTPUT = Z j ( A i ^)
_► OUTPUT =Z (A ) 
r k lk





































FIGURE 5.31 A BYTE-SLICED MULTIPLEXER
MEMORY CO
FIGURE 5.32 A SEQUENTIAL MACHINE MODEL









s p pi = s ( n - ,p-:




if memory Pi wf





p < - * (n t pi )
w(p. )e S 
=tt(&(Nf,Pp)
(SC W)
u f (Sf (Nf.Pp)
_^ u(S(N£- ,pi) )es @
L
Of(Sf(NT PT))^S @














FIGURE 5.34 A FAULT SECURE SYNCHRONOUS SEQUENTIAL MACHINE
0/0 Q  1/° n 0^9^---- ► 92 ---








NEXT STATE OUTPUT y
0 1
0 1
q2 q i q2 q i
Q 1 0 0 0 0 0 1 0 0
q 2 o 1 0 0 1 0 0 0
*3 1 0 0 0 1 1 0 0










P = (even parity 
codewords )






0 1 0 1
P'3 ^ 2 ^ 1 P% P2 pl y2 yl y2 yl
0 0 0 0 0 0 1 0  1 1 0 1 0
p 1 0  1 
2
0 0 0 * 1 1 0 1 0 1 0
p 1 1 0
*3
0 0 0 O i l 1 0 1 0
p O i l  
*4
0 0 0 0 1 1 * 0 I 0 1
(b) Logic Diagraa













Some X j 
(and P^)
e x  p
memory 
any f e Ft any f 6
S f
any f e Ft
FIGURE 5.37 A SELF TESTING SEQUENTIAL CIRCUIT
-164-
CHAPTER SIX : THEORY AND DESIGN OF SELF CHECKING CIRCUITS
6.1 : INTRODUCTION
Chapter 5 has defined the various types of self checking 
circuit and discussed their properties. This Chapter 
develops that theory in order to design a number of 
totally self checking (TSC) circuits.
There is still no universally adopted technique for 
designing self checking circuits, so it is often necessary 
to form a self checking network out of a number of smaller 
blocks, whose self checking abilities are easily demon­
strated. The minimal attention paid by researchers in 
this field to developing any generalised theory has not 
been helped by many of the papers referenced so far, which 
present designs, yet give insubstantial explanations of 
any fundamental theory, their design procedures and how to 
readily demonstrate that the designs are in fact self 
checking. This Chapter attempts to remedy that situation 
with a detailed development of self checking analysis and 
design from the fault testing requirements of logic gates.
Most research has been devoted to self checking checkers 
(see section 5.4), a circuit which effectively converts a 
specific input code to a l-out-of-2 output code, see Fig. 
6.1a. There is also a requirement for self checking 
circuits with more than one encoded input, illustrated in 
Fig 6.1b, and self checking circuits with uncoded inputs, 
illustrated in Fig. 6.1c. Throughout the remaining 
Chapters only self checking circuits which output a l-out- 
of-2 code are considered, for reasons given in section 
5.4.1.
It is shown in section 6.5 that a circuit with a 1-out-of- 
2 code output is inherently fault secure if each output is 
computed with an independent subcircuit. On this basis 
only the self testing capabilities of the circuit need to 
be evaluated for self checking purposes. A number of
-165-
theorems are proposed in sections 6.2 to 6.4 which specify 
the tests required to detect all single stuck-at faults in 
single and cascaded AND, OR, EXOR and inverter gates. 
Section 6.4 also introduces the concept of merged test 
sets, which allow simultaneous testing of more than one 
gate. The self testing property requires that all the 
necessary tests are generated within a circuit at some 
point during normal operation from the application of 
codeword inputs only. This aspect is discussed in section 
6.5.
Section 6.5 applies the theory of section 6.4 to design 
totally self checking checkers using Karnaugh maps C6.13. 
Two theorems specify the conditions for a 2-level AND/OR 
or OR/AND structure to be self testing and two corollaries 
show how a Karnaugh map representation of the circuit is 
used to determine if these conditions are met. Section 6.6 
implements this design procedure for an n-input comparator 
(equality checker), where the input pairs are initially
l-out-of-2 encoded. This circuit appears (mainly for n=2) 
in virtually any discussion of' TSC circuits, but its 
design has never been adequately documented. An often 
quoted formula for this design is also derived. Section 
6.7 again uses the theory of section 6.4 to design a TSC 
parity code checker.
The design of a TSC 1-out-of-n code checker is discussed 
in section 6.8, with specific reference to a 2 to 4 line 
decoder and its l-out-of-5 checker. Finally, section 6.9 
describes a TSC periodic signal checker. All of these 
designs are subsequently employed in the self checking 
computer of Chapter 9.
6.2 : TESTING FOR FAILURES IN AND, OR AND INVERTER GATES
Consider a 2-input AND gate which has four input combina­
tions and six single stuck-at faults. Fig. 6.2 details 
the faults detected by each input combination. A failure 
is detected by an incorrect output, so the output stuck-
at-0 (SAO) is detected by an input combination which
normally produces a 1 output (<11>) and the output stuck- 
at-1 (SA1) is detected by an input combination which
normally produces a 0 output (<00>, <01> or <10>). An 
input SAO is detected by combinations which apply a 1 to 
that input and levels to the other inputs such that the 
input failure produces an incorrect output. Thus <11> 
detects inputs A or B SAO in a 2-input AND gate (see Fig.
6.2). Similarly an input SA1 is detected by combinations 
which apply a 0 to that input and produce an incorrect 
output due to the failure. These are <01> for A SA1 and
<10> for B SA1 in a 2-input AND gate. A stuck-at input
will not be detected by applying the stuck-at level to it 
and a stuck-at output will not be detected by an input 
combination which produces the stuck-at level as an 
output. Fig. 6.3 details the input combinations to a 2- 
input OR gate and the single stuck-at faults they detect.
Using this information, it is desirable to ascertain the 
minimum test set T for each gate. However, this is more 
clearly observed from the fault tables given in Fig. 6.4 
for 3-input AND and OR gates. From this table two 
theorems are proposed.
THEOREM 6.1 : Every single stuck-at fault in an n-input 
AND gate is detected with the application of a single 0 to 
every input (n combinations) and all inputs at 1, a total 
of n+1 input combinations.
Proof s The only test for an input SA1 is to apply a 0 to 
that input and a 1 to all others. For n inputs there must 
be n such tests, a single 0 applied to each input. All 
these tests produce a 0 at the gate output during fault
free operation and a 1 when an input SA1 is detected.
They will therefore also detect an output SA1 failure.
The only test for an input SAO is all inputs at 1. Any 
input SAO will be detected by this test, which produces a 
1 output during fault free operation and a 0 output with a 
single (or multiple) input SAO failure. The test there­
-167-
fore also detects an output SAO failure. Thus all single 
stuck-faults are detected by these n+1 tests. This set 
of tests is defined as T&, so that T = T&.
THEOREM 6.2 : Every single stuck-at fault in an n-input OR 
gate is detected with the application of a single 1 to 
every input (n combinations) and all inputs at 0, a total 
of n+1 input combinations.
Proof s Using Fig. 6.4. the proof for this theorem is 
similar to that for theorem 6.1. This set of tests is 
defined as T+, so that T = T+.
Although self evident, the testing requirments for an 
inverter gate are given for completeness by the following 
theorem.
THEOREM 6.3 : Every single stuck-at fault in an inverter 
gate is detected with application of a 0 and 1 to its 
input.
Proof : Fig. 6.5 gives the fault table for an inverter 
gate. This shows that a 0 input will detect input SA1 and 
output SAO failures, whereas a 1 input will detect input 
SAO and output SA1 failures; i.e. both input combinations 
are required to fully test the gate. This set of tests is 
defined as T', so that T = T'.
6.3 : TESING FOR FAILURES IN EXCLUSIVE-OR GATES
The procedures for testing stuck-at failures in an 
Exclusive-OR (EXOR) gate are somewhat different to those 
in section 6.2 and hence are dealt with separately. Fig. 
6.6 gives the fault tables for 2, 3 and 4-input EXOR
gates. From these it can be seen that there are 2 n ~ ^  (n 
is the number of inputs) combinations which will test for 
every single stuck-at failure. Half of the inputs which 
detect an input stuck-at failure produce a 1 output from 
the gate during fault free operation and the other half
-168-
produce a 0 output. It is therefore much more difficult 
to derive a minimum test set. Any input combination and 
its inverse will detect all input SA faults in all gates, 
for example <000> and <111> when n = 3. These two 
tests will also detect both output failures in gates
with odd n (n=3 in Fig. 6.6). They will not do the same
for gates with even n because both produce the same output 
(n=2 and n=4 in Fig. 6.6). In this case a third test is 
necessary which produces an output opposite to that of the 
first two. From this discussion the following theorem is 
proposed.
THEOREM 6.4 : Every single stuck-at failure in an n-input 
EXOR gate is detected with the application of input
combinations which provide a 0 and a 1 to every input. 
There must also be at least one input combination which 
has an odd number of l's and at least one input combina­
tion which has an even number of l's.
Proof : Single input SAO failures are detected by applying 
a 0 to every input and single - input SA1 failures are 
detected by applying a 1 to every input. The combinations 
are therefore not as restrictive as those for AND and OR 
gates. They can range from a single 0 and a single 1
applied to all inputs to the minimum set of two tests (any 
combination and its inverse). The only restriction is 
that all inputs receive a 0 and a 1. This is possible 
since, given any input combination, a single change in 
this combination due to a fault will produce a change in 
the output and will thus be detected. The requirement for 
at least one input combination which has an odd number of 
l's and at least one input combination which has an even 
number of l's ensures that the output of the gate will be 
a 1 and 0 respectively, so that both output faults can be 
detected. This set of tests is defined as T ©  .
6.4 : TESTING FOR FAILURES IN CASCADED GATES
So far, the tests for single stuck-at faults have been for
-169-
gates in isolation. In practice gates will be cascaded, 
so the effect this has on testing them for failures must 
be considered. The requirements of theorems 6.1-6.4 must 
still be met for each gate, to ensure that all failures 
within them can be detected. Assuming that this is the 
case, a detected failure within the circuit needs to be 
propagated to its output(s) so it can be detected extern­
ally by an error indicator or checker. Theorem 6.5 gives 
the conditions for this to occur.
THEOREM 6.5 : Every test t 6 T applied to a single gate,
where T is the test set for that gate (and given by
theorems 6.1-6.4), has an output y^, the level it occupies
with the fault present, which is the inverse of its normal
output. All single stuck-at faults in the circuit
consisting of a gate whose output y is connected to an
input x^ of a second gate can be detected at the output
of the second gate, if V t € T^ are applied to the first
gate, where T^ is T for that gate, and for at least one
occurrence of each t there must be an input combination to
the second gate which tests for input x. stuck-at-yf, 
f f i t .
where yfc is y for test t. In addition V t € must be 
applied to the second gate, where is T for that gate.
Proof : Consider a 3-input AND gate feeding a 3-input OR 
gate, as shown in Fig. 6.7a. A fault in the AND gate
needs to be propagated to the output of the OR gate. With
the application of V t 6 T^ = T&, a fault in the AND gate 
is detected when its output is opposite to that expected. 
It is this faulty level, d^, which must be propagated to 
output h. There is a d^ for each t, i.e. d£, so that for
at least one occurrence of each t (t could occur more than
once) there must be an input combination to the OR gate 
which detects f stuck-at-d^. For example, in Fig. 6.7a,
a = b = c = 1 tests for a, b, c or d SAO. During fault
free operation d=l, but for any of these faults d=0. To 
propagate this fault the OR gate must have the appropriate 
input combination to detect f SAO, i.e. a single 1 on f, 
<efg> « <010>. From Fig. 6.7b the four conditions to
-170-
detect and propagate all faults in the AND gate require 
<efg> = <000> and <010>. However the OR gate requires 
V t € = T+ to fully test it, so <efg> = <100> and <001>
must also occur.
Example 6.1 : The structure of Fig. 6.7a is expanded in 
Fig. 6.8a to give a complete two level AND/OR structure. 
There are now three 3-input AND gates to test, so the 
procedure adopted in Fig. 6.7b to test for a fault and 
propagate it is detailed for each AND gate in turn.
The fault propagation requirement means that whilst one
AND gate is being tested, the other two must have any
input combination except <111> so that they output a 0,
denoted by X in Fig. 6.8a. As a result, the required
OR gate input combinations automatically satisfy the
conditions to fully test this gate, see Fig. 6.8b. From
Fig. 6.8b there are twelve conditions which must be 
*
satisfied to fully test the circuit for all single stuck- 
at faults. However, this can be reduced to a minimum of 
six tests as follows. The tests*for a SA1 fault in any 
AND gate (inputs <011>, <101> and <110>) all require the 
OR gate input to be <000>. These tests produce a 0 output 
from the tested gate during fault free operation, so they 
can be merged without affecting the fault propagation 
requirements. This allows all AND gates to be simultan­
eously tested for SA1 faults. Any single SA1 fault in any 
of the three AND gates will then produce a 1 output from 
that gate and a change in the OR output from 0 to 1. The 
tests can be merged in any combination. Fig. 6.8c gives 
two examples. The tests for SAO failures in the AND gates 
cannot be merged in a similar manner, since the propaga- 
ion requirement allows only the tested AND gate output to 
be a 1. SAO tests cannot be merged with a SA1 test 
either, as a detected fault in the former would not be 
propagated, since the OR gate already has an input at 1 
due to the latter. (Another due to a SAO fault will not 
cause a change at its output.) If the SA1 tests are fully 
merged, the circuit in Fig 6.8a requires a minimum of six
-171-
tests, as indicated in Fig. 6.8c.
Fig. 6.9 extends the concepts of Fig 6.8 to give the test 
requirements for various cascaded OR/AND, AND/AND, OR/OR 
and mixed structures. These structures are totally self 
checking for all single stuck-at faults if all the first 
level gates are fully tested. An n-input AND or OR gate 
requires n+1 tests to fully test it (from theorems 6.1 and
6.2), so the total number of unmerged tests Tu for the 
2-level structures in Fig. 6.9 is given by:
m
T = I (n. + 1) : where n. is the number of inputs for
i = 1 1gate i and m is the number of first
level gates. (6.1)
To calculate the minimum number of tests which occur in a 
fully merged test set, consider 2-level AND/OR and OR/AND 
structures only, with m x n-input first level gates. The 
merging process is applied to SA1 tests for the AND/OR 
structures and SAO tests for the OR/AND structures. In 
either case these tests constitute n of the n+1 tests for 
each gate (see proof of theorem 6.1). In a fully merged 
test set these n tests for m-1 of the first level gates 
are combined with those for the remaining gate. For the 
circuit conditions given above, Tu= m(n+l), so the number 
of fully merged tests T^ is then given by:
T = m(n+l) - (m-l)n = m+n (6.2)m
Applying equation (6.2) to example 6.1, for which m=3 and 
n=3, gives T^=12 and Tm=6, as expected.
Note that for AND/AND and OR/OR structures the only tests 
which can be merged are the SAO and SA1 tests respective­
ly, which in fact are automatically merged due to the 
requirements of the second level gate for fault propaga­
tion, see Fig. 6.9b and 6.9c.
A corollary is now presented to theorem 6.5 for the casca- 
✓
-172-
ding of EXOR gates.
Corollary 6.1 : All single stuck-at failures in a circuit 
consisting of an EXOR gate whose output y is connected to 
an input x^ of a second gate can be detected at output of 
a second EXOR if, V t 6 T ®  are applied to both gates.
Proof : Consider a 2-input EXOR gate feeding another 
2-input EXOR gate, as shown in Fig. 6.10a. Using theorem 
6.4, Fig 6.10b details a set T = T ©  for the first gate 
and possible input combinations to the second gate to 
propagate detected faults. As stated before, a single bit 
change to an EXOR input combination will always change its 
output during fault free operation, so if c changes to c^ 
in the first gate, this level will always be propagated 
through the second. It is only necessary, therefore, to 
ensure that V t € T ©  are applied to each gate.
So far, only two cascaded gates have been considered. If, 
as is likely, more than two gates are cascaded, all single 
failures in every gate need to be propagated to the final 
output(s). To ensure that this happens, theorem 6.5 must 
now be applied to each pair of connected gates in the 
circuit and its principles extended until the final output 
is reached. Using Fig. 6.11a as an example, the procedure 
is as follows:
1) Start at gate A:
a) For V t 6 T& (gate A), determine appropriate 
t 6 T+ for gate B from theorem 6.5.
b) For the t € T+ required for gate B in a), 
determine appropriate t € T& for gate C from 
theorem 6.5.
2) Start at gate B:
a) For V t 6 T+ (gate B), determine appropriate 
t € T& for gate C from theorem 6.5. Some of 
these tests will have already been provided by 
la) .
3) Ensure that V t 6 T& are applied to gate C. Again,
-173-
some of these will have already been provided by lb) 
and 2a).
Fig. 6.11b gives the results of 1), 2) and 3) for the
circuit in Fig. 6.11a.
6.5 : DESIGN OF TSC CHECKERS USING KARNAUGH MAPS
A self checking checker maps a coded input (or inputs) to
a l-out-of-2 code output, as shown in Fig. 6.1a. Its
normal input set is Xc and its output set is Sc, where
S = £<01>,<10>}. From section 5.4.1, two fundamental c
rules must be applied to every TSC checker design. 
These are:
1) The checker must be code disjoint:
a) Code inputs are mapped to outputs <01> or <10>.
b) Non code inputs are mapped to outputs <00> or 
<11>.
2) Both code outputs must be used:
a) At least one code input must be mapped to output 
<01>.
b) At least one code input must be mapped to output 
<10>.
A checker must be totally self checking on the basis of 
codeword inputs only. If an independent subcircuit is 
used for each output, the checker will be fault secure for 
all faults affecting one subcircuit only. Under these 
fault conditions it will output the correct codeword or a 
non codeword. It is also.fault secure for faults creating 
a unidirectional error at its outputs. In general, only 
single stuck at faults are considered for analytical and 
design purposes. Independent subcircuits will be adopted 
for all designs, so only the self testing capabilites of 
each subcircuit need to be evaluated.
A design procedure is now presented for TSC checkers using
Karnaugh maps £6.13. From a Karnaugh map any circuit 
✓
-174-
design can be implemented using either a sum of products 
(normal minterm form) or a product of sums (normal maxterm 
form) C6.13. A sum of products implementation is a two 
level AND/OH structure as shown in Fig. 6.8a, whilst a 
product of sums implementation is a two level OR/AND 
structure as shown in Fig. 6.9a. Every design requires a 
Karnaugh map for each output. The size of the map is 
determined by the number of network inputs. Code inputs 
are depicted by a circle within the appropriate minterm 
square on each map.
6.5.1 : TSC AND/OR STRUCTURES
The following theorem ensures that an AND/OR structure is 
self testing for code inputs.
THEOREM 6.6 : A 2-level AND/OR structure is self testing 
for all single stuck at faults if code inputs provide:
i) At least one occurrence of a single 0 on each input 
of every AND gate with no other AND gate output at 1. 
ii) At least one occurrence of an all 1 input to each AND 
gate with no other AND gate output at 1.
Proof : The single 0 and all 1 requirements for each gate 
are the conditions of theorem 6.1 to detect all single 
stuck-at faults in an AND gate, whilst the occurrence of 
these with no other AND gate output at 1 satisfies the 
condition of theorem 6.5 to propagate a fault. Section
6.4 and Fig. 6.8 have already demonstrated that the single 
0 tests can be merged and that the requirements of theorem
6.5 for the second level OR gate will be automatically 
met.
A means of interpreting this theorem on a Karnaugh map is 
now required. Before presenting a corollary to theorem
6.6 for this purpose, a definition is required.
Definition 6.1 : The expansion of a prime implicant C6.13
-175-
with respect to one of its constituent variables is the
area on a Karnaugh map where that variable alone is at 0.
Fig. 6.12 shows these areas on a 4 variable Karnaugh map 
for each variable of a 2, 3 and 4 variable prime
implicant.
Corollary 6.2 : A 2-level AND/OR structure is self testing 
for all single stuck-at faults from code inputs if the
following conditions are satisfied on a Karnaugh map:
i) In the expansion of each prime implicant with respect 
to each constituent variable, there must be at least 
one codeword input which results in a 0 output; i.e. 
a ©  .
ii) Each prime implicant must contain at least one code­
word input which is unique to that prime implicant,
i.e. not covered by more than one prime implicant.
Proof : A codeword containing a 0 ( © )  in the expansion 
of a prime implicant with respect to a variable, results 
in this variable alone being a 0 for that code input. A
codeword containing a 1 ( © )  in the expansion of a prime 
implicant with respect to a variable, results in a 1 at 
the output of the OR gate, i.e. one of the other AND gates 
has a 1 output for that code input. Whilst this latter 
condition is allowed, there must be at least one instance 
where it does not occur (for codeword inputs), so the
expansion of the prime implicant must contain at least one 
© .  In order to provide a single 0 to each input of 
every AND gate, there must be at least one ©  in the 
expansion of every prime implicant with respect to all of 
its variables.
Every prime implicant must contain a codeword. If it does 
not, the AND gate it represents may receive all input 
combinations which have a single 0, but it will never 
receive the all 1 input combination to test this gate from 
code inputs. This all 1 input combination also provides a
-176-
single 1 test to the second level OR gate. A codeword 
which is covered by more than one prime implicant results 
in all these prime implicants (AND gates) presenting a 1 
to the second level OR gate for that code input. Whilst 
this condition is allowed, there must be at least one 
instance where it does not occur (for code inputs), so 
that each prime implicant must contain at least one code 
input which is not covered by any other prime implicant.
Fig 6.13 illustrates these points in a number of proposed 
designs for one output of a TSC checker, which are not 
self testing and therefore not in fact TSC for the reasons 
given in each case.
A prime implicant which contains an inverted variable does 
not create a problem as Fig. 6.14 indicates. The AND 
gates still receive all their required test inputs and 
since every AND gate input has to be 0 for some tests and 
1 for others, the added inverter gate will be automatical­
ly tested to the conditions given in theorem 6.3.
6.5.2 : TSC OR/AND STRUCTURES
The following theorem ensures that an OR/AND structure is 
self testing from code inputs.
THEOREM 6.7 : A 2-level OR/AND structure is self testing 
for all single stuck at faults if code inputs provide:
i) At least one occurrence of a single 1 on each input of 
every OR gate with no other OR gate output at 0.
ii) At least one occurrence of an all 1 input to each OR 
gate with no other OR gate output at 0.
The proof is similar to that for theorem 6.6 and is there­
fore not included. A circuit design can be implemented 
just as easily in an OR/AND form from a Karnaugh map as 
the AND/OR form C6.13. Prime implicants now become prime 
implicates, covering 0's instead of l's. Definition 6.1
-177-
now needs to be modified.
Definition 6.2 : The expansion of a prime implicate with 
respect to one of its constituent variables is the area on
a Karnaugh map where that variable alone is at 1.
Fig. 6.15 shows these areas on a 4 variable Karnaugh map 
for each variable of a 2, 3 and 4 variable prime 
implicate.
A similar corollary to that for theorem 6.6 can now be
presented.
Corollary 6.3 : A 2-level OR/AND structure is self testing 
for all single stuck-at faults from code inputs if the
following conditions are satisfied on a Karnaugh map:
i) In the expansion of each prime implicate with respect 
to each constituent variable, there must be at least 
one codeword input which results in a 1 output; i.e.
a (T) •
ii) Each prime implicate must contain at least one code­
word input which is unique to that prime implicate; 
i.e. not covered by more than one prime implicate.
Again the proof is not included, as it is so similar to 
that for corollary 6.2
TSC circuits and networks can now be designed using the 
above theorems and corollaries.
6.6 : A TSC N-BIT COMPARATOR
A TSC circuit is frequently required to compare two n-bit 
words. Consider, initially, that each input and its 
corresponding input in the other word are encoded in a
l-out-of-2 code, in the manner of Fig. 6.16. This circuit 
is then an n x l-out-of-2 to 1 x l-out-of-2 code conver­
ter .
-178-
Consider two pairs of inputs <a^b^> and r so that
there are four possible code inputs <a^b^a^b^>; <0101>,
<0110>, 1001> or <1010>. Each of these must be mapped to 
one of the code outputs <01> or <10>, with each output 
occurring at least once. Non code inputs are mapped to 
the non code outputs <11> or <00> for the circuit to be 
code disjoint. In addition, the circuit must be self 
testing from only the four code inputs.
Circuit options for each output are:
1) A single AND (OR) gate with up to three inputs, since 
the number of tests for a 3-input gate is four.
2) A single EXOR gate, since a minimum of two or three 
tests are required for an n-input gate, where n is odd 
or even respectively.
3) A 2-level AND/OR (or OR/AND) structure wth up to 2 x 
2-input AND gates, since the number of tests when 
fully merged is four.
A single AND with two or three inputs is represented on a 
Karnaugh map by a block of two or four l's respectively. 
Both of these prime implicants can only cover a single 
codeword, so option 1) is eliminated. Example 6.2 
indicates that option 2) can provide a design that is self 
testing but not code disjoint, whereas example 6.3 shows 
that option 3) is the most viable.
Example 6.2 : The EXOR/EXNOR combination in Fig. 6.17a 
provides the desired function. Its truth table in Fig. 
6.17b demonstrates that it is self testing for code 
inputs, but also shows that it is not code disjoint.
Example 6.3 : Option 2) requires a block of four l's on a 
4 variable Karnaugh map for each 2-input AND gate. The 
design of Figs. 6.18a and 6.18b is a 2-level AND/OR 
structure which is self testing from corollary 6.2. Its 
output <3^ 2> *s given by:
-179-
yx = + bxa2
and y2 = + b2b2 (6.3)
The structure of each subcircuit is identical, so Fig. 
6.18c details the testing requirements for one of them. 
This process has already been demonstrated in example 6.1. 
It shows that the inputs to each subcircuit must satisfy 
six conditions to fully test for all single stuck-at 
faults and that this number may be reduced to four in a 
fully merged test set. However, there are only four code­
word inputs, so each subcircuit MUST have a fully merged 
test set, with each codeword generating one of the four 
tests. The truth table of the circuit in Fig. 6.18d 
demonstrates that it is code disjoint and that it uses a 
fully merged test set for self testing purposes. Fig. 
6.18e gives the fault table for the subcircuit which 
computes y^, which confirms that all code inputs are 
required to test it. It is interesting to note that the
second level OR gates provide a l-out-of-4 to l-out-of-2
code conversion. This process* is not, however, code 
disjoint. The code disjoint property of the overall 
circuit is provided by the first level AND gates.
Example 6.4 : In a similar manner to example 6.3, the 
design of Figs. 6.19a and b is a 2-level OR/AND structure 
which is self testing from corollary 6.3. Its output 
< y ^ 2> is given by:
yl = (al + a2)(bl + V
and y2 = (a1 + b2>(b1 + a2> (6.4)
Figs. 6.19c and d demonstrate that it is code disjoint and 
uses a fully merged test set. The second level AND gates 
in this instance provide a 3-out-of-4 to l-out-2 code 
conversion which is not code disjoint.
The circuits of examples 6.3 and 6.4 are those presented 
by Carter C6.2D and Anderson C6.33. They are now expanded
-180-
for more than two input pairs and henceforth referred to 
as 2-input Morphic AND gates.
A Morphic AND gate is a conventional AND gate where each 
input and output is replaced by a pair of inputs and a 
pair of outputs respectively, with each pair encoded in a
1-out-of-2 code. The output is a codeword if all input 
pairs are codewords. Thus the output pair <y^y^> a
2-input Morphic AND gate is given by:
<yly2> = <aibi> &m <a2b2> (6*5>
where is defined as the Morphic AND operation. An 
n-input Morphic AND gate has n input pairs, i.e. 2n 
physical inputs. Fig. 6.20 compares the truth tables for 
conventional and Morphic 2-input AND gates. The truth 
table of the latter confirms that examples 6.3 and 6.4 
are 2-input Morphic AND gates.
If y^^ and denote the outputs of an n x l-out-of-2 to 
1 x l-out-of-2 converter, or n*-input Morphic AND gate, 
then for n=l:
Using equations (6.4) and (6.6), the outputs for n=2 are 
given by:
A 3-input Morphic AND gate, illustrated in Fig. 6.21a, 
consists of two cascaded 2-input Morphic AND gates C6.43, 
so that;
* n  - ai 
*12 = bl (6.6)
*21 = alb2 + bla2 * *llb2 + *12a2 
*22 = ala2 + blb2 = *lla2 + *21b2 (6.7)
-181-
(6.8)
Equation (6.8) indicates that the 3-input Morphic AND gate 
can also be implemented as the 2-level AND/OR structure of 
Fig. 6.22a. However, it needs to be established if this 
circuit or that in Fig. 6.21a is totally self checking. 
Each 2-input Morphic AND gate in Fig. 6.21a is TSC 
(inherently fault secure and self testing from theorem 
6.6), code disjoint, fully exercised (the second level 
Morphic AND gate also receives all input codewords, see 
the truth table in Fig. 6.21b) and there is no reconver- 
gent fanout in the network. Thus the conditions of 
corollary 5.1 are satisfied and the network is TSC. 
Applying corollary 6.2 to the Karnaugh map representation 
of equation (6.8) in Fig. 6.22b reveals that the 2-level 
AND/OR structure is also self testing and thus TSC. The 
truth table for output y ^  in Fig. 6.22c also shows that 
this implementation requires all codeword inputs to fully 
test it, but does not use a fully merged test set (seven 
tests).
From Fig. 6.23a a 4-input Morphic AND gate is given by:
Equation (6.9) can again either be implemented as in Fig. 
6.23a, or as the 2-level AND/OR structure shown in Fig. 
6.23b. Both are TSC.
In general, from equations (6.7) to (6.9), an n-input 
Morphic AND gate is given by:
Equation (6.10) is that presented by Carter C6.53 and 
Anderson C6.33.
*41 = y31b4 + y32a4 
y42 = y31a4 + y32b4 (6.9)
ynl ” y (n-l)lbn + y (n-l)2an 
yn2 = y (n-1)lan + y (n-l)2bn (6.10)
-182-
Equations (6.7) to (6.9) also show that each output has
2n  ^prime implicants, each with n variables. This means
n "■ 1that both subcircuits require 2 x n-input AND gates and 
n _2
a 2 -input OR gate. The n-input Morphic AND gate there­
fore satisfies the conditions associated with equation 
(6.2), which gives the number of fully merged tests as
n_ j
Tm = 2 +n. The number of possible codeword inputs to an
n-input Morphic AND gate is 2n , since each input has two 
code combinations (<01> and <10>). Equating the number of 
codewords to Tm reveals that only a 2-input Morphic AND 
gate uses a fully merged test set.
Consider now a 4-input Morphic AND gate formed from 2 x 
2-input Morphic AND gates, both implemented as an AND/OR 
structure, which feed a third 2-input Morphic AND gate 
implemented as an OR/AND structure, see Fig. 6.24a. Each
2-input Morphic AND gate is TSC, so the use of mixed 
structures in no way affects the self checking ablility of 
the network. In Fig. 6.24a, there is a set of 2 x
2-input OR gates feeding a third 2-input OR. These gates 
can be merged to give a set of 4-input OR gates and the 
circuit of Fig. 6.24b. The second level of the AND/OR 
structure has now been merged with the first level of the 
OR/AND structure, which results in fewer gates and gate 
levels. This merging process does not affect the self 
checking ability of the network either, because the 
minimum test set for the cascaded 2-input OR gates (which 
must exist in the original network) is identical to that 
for the single 4-input OR gate of Fig. 6.25. This set of 
OR gates forms a non code disjoint 2 x l-out-of-4 to 1 x
3-out-of-4 code converter.
If the 4-input Morphic AND gate is formed from 2 x 2-input 
OR/AND Morphic AND gates, which feed a 2-input AND/OR 
Morphic AND gate, then a similar merging process can be 
performed on the central AND gates, as shown in Fig. 6.26. 
This set of AND gates form a non code disjoint 2 x 3-out- 
of-4 to 1 x l-out-of-4 code converter.
-183-
A Morphic AND gate for a large number of inputs can use 
both of these merging processes. This significantly 
reduces the number of gates and gate levels when compared 
to the cascading of original 2-input Morphic AND gates. 
Fig. 6.27 shows both of these networks for n=8. The third 
option, a 2-level AND/OR structure, would require a total 
of 256 x 8-input AND gates and 2 x 128-input OR gates.
All the circuits in this section so far have dealt with 
input pairs encoded in a l-out-of-2 code. The original 
intention was to design a TSC n-bit comparator or equality 
checker. In this case the input pair is effectively en­
coded in the duplication code, i.e. both inputs identical, 
as opposed to the l-out-of-2 code in which they are always 
opposite. It is therefore a simple matter to convert all 
of the above designs into equality checkers by inserting 
an inverter gate into one input of every pair, as shown in 
Fig. 6.28. These extra gates convert the duplication code 
into a l-out-of-2 code. They will not affect the self 
checking ability of each design and will be automatically 
tested to meet the conditions of theorem 6.3 from 
codewords (every inverter input receives a 1 and a 0 from 
codewords).
6.7 : A TSC PARITY CODE CHECKER
A TSC parity checker is a circuit which converts a parity 
code to a l-out-of-2 code. This is achieved by splitting 
the bits of a parity encoded word into two groups, where 
each group contains at least one bit, as follows:
1) Odd Parity : An odd number of l's in the codeword 
creates an odd number of l's in one group and an 
even number of l's in the other. Therefore, if each 
group is checked for an odd number of l's using an 
EXOR tree, one output will be high and the other 
low; i.e. a l-out-of-2 code. Fig. 6.29a shows the 
resulting odd parity checker.
-184-
2) Even Parity : An even number of l's in the codeword 
creates an odd number of l's in both groups or an 
even number of l's in both groups. Therefore, if one 
group is checked for an odd number of l's using an 
EXOR tree and the other group is checked for an even 
number of l's, then a l-out-of-2 code output will be 
produced. Fig 6.29b shows the resulting even parity 
checker.
An independent subcircuit for each of the two outputs 
ensures that the checker is fault secure, so, again, only 
its self testing ability needs to be evaluated.
Example 6.5 : Consider an 8-bit data word with a single
odd parity bit. Fig. 6.30a shows the 9-bit checking
circuit where bits x^, x2, x^ and x^ form group 1 and bits
X5, x6, x?, Xg and Xg form group 2, such that;
yl = xi ©  x2 ©  x3 ©  x4
y2 = x5 ©  x& ©  x? ©  Xg ©  x9 (6.11)
The data word is assumed to take on all possible combina-
o
tions (2 ) so that the group 1 inputs will do likewise.
g
The overall codeword, however, only takes on 2 out of the
9 82 possible combinations. (The other 2 combinations are
even parity codewords.) The group 2 inputs, which include
the parity bit, will take on all combinations, though,
since the group 1 inputs are always able to provide
appropriate levels to form an odd parity codeword. The
truth tables are thus given in Fig. 6.30b and 6.30c for
each subcircuit with all input combinations. They
indicate that each gate will receive all its input
combinations, so the conditions of corollary 6.1 are
satisfied. Thus each subcircuit is self checking and the
checker is TSC overall.
Example 6.5 demonstrates that a TSC checker for a k-bit 
parity encoded word can be constructed from a p-input 
EXOR tree and a q-input EXOR tree, such that:
-185-
p + q = k, where p and q >, 1 (6.12)
In example 6.5 k=9, p=4 and q=5. The parity checker used 
as part of the partially self checking circuit in Fig. 
5.21 is another example, where p=l and q = k-1.
6.8 : A TSC 1-0UT-0F-N CODE CHECKER
A particular case of m-out-of-n codes are 1-out-of-n 
codes. They occur frequently in control signals where the 
code is generated from an m to n line decoder of the form 
of Fig. 6.31. An example of this is address decoding in a 
microprocessor based system. A TSC means of checking a 
1-out-of-n code is therefore required. The author has 
investigated many 2-level AND/OR structures for this 
purpose, but all fail to be self testing because of only 
n codewords in 2n combinations and the fact that each 
codeword has only a single 1. A more complex circuit must 
therefore be necessary.
Anderson C6.63 and Marouf C6.73 have proposed that a 
1-out-of-n code is converted to a k-out-of-2k code, which 
is then checked by a TSC k-out-of-2k to l-out-of-2 conver­
ter, as shown in Fig. 6.32a. Kraft C6.83 and Khakbaz C6.93 
have proposed that a 1-out-of-n code is converted to a 
k x l-out-of-2 code, which is then checked using the 
circuit described in section 6.6 and shown in Fig. 6.32b. 
It is this latter approach which is adopted here, as 
follows.
The 1-out-of-n to k x l-out-of-2 converter, circuit L in 
Fig. 6.32b, has n inputs, x^...x^...x and k output pairs 
<c^d^>...<Cjdj>...<0 4^^), where k is given by:
A input codeword with x^=l maps to an output combination 
such that <c^...cj...c^) is the binary representation of
-186-
i. In each case d. is the inverse of c,. The only
3 3 y
exception to this is where n is a power of 2, in which
case the codeword with x =1 maps to c, =. . . =c . =. . . =c. =0.n 1 j k
Each output Cj and d^ is computed using an independent
subcircuit,, so the network is inherently fault secure. 
The network must also be code disjoint and self testing. 
These properties are demonstrated in example 6.6 below.
The k x l-out-of-2 to 1 x l-out-of-2 converter, circuit C 
in Fig. 6.32b, consists of a series of cascaded 2-input 
Morphic AND gates with level merging, if appropriate, 
as described in section 6.6. If n is not a power of 2, 
then all possible codewords will not be generated at the 
outputs of circuit L, since L has n code inputs and n code 
outputs. Circuit C implemented as a 2-level AND/OR (or
lr
OR/AND structure) will require all 2 codewords to be 
self testing, so this design is not viable. However, 
every 2-input Morphic AND gate also requires all four 
codewords to be self testing, so the connections from 
circuit L to circuit C will have to ensure that this 
occurs. Example 6.6 illustrates' this point.
Example 6.6 : Consider the 2 to 4 line decoder shown in 
Fig. 6.33. This is a similar circuit to that used in 
example 5.6, except that it has an additional input, 
enable g. When g=0 the decoder operates normally, but 
when g=l all outputs are at 0. All outputs at 0 is not 
a l-out-of-4 codeword, but if g is included as part of the 
output, a l-out-of-5 code will always be generated.
Circuit L (of Fig 6.32b) will have five inputs, so that 
n=5, and k output pairs, where k=3 from (6.13). Fig. 
6.34a gives the code inputs and required code outputs for
this circuit, based on the procedure given above. Fig. 
6.34b shows the Karnaugh maps for each output pair <c^dj>, 
with noncode inputs assigned to give a <00> or <11> 
output. From these:
C. = X c +  X.1 5 4 , d1 = x3 + x2 + xx
-187-
c3 - x5 + ^  + x x
c2 = x3 + x2 d2 = x5 + x4 + x -l  
d3 = x4 + x2 (6.14)
Applying corollary 6.2 to each Karnaugh map demonstrates 
that all subcircuits are self testing, so that L is TSC.
Circuit C (of Fig 6.32b) will be a 3-input Morphic AND
gate consisting of two cascaded 2-input Morphic AND gates,
as shown in Fig. 6.35a. Fig. 6.34a shows that only five
of the possible eight output codewords are generated from
circuit L, so that a Morphic AND gate with inputs <c^d^>
and <c7d«> will not be self testing, since it will not 
•¥receive <1010>as a codeword. The pairing of < C 2 & 2 ^ and 
(c^d^), however, does generate all four codewords, so 
these must therefore be connected to the first level 
Morphic AND gate in C and <c^d^> connected to the second 
level Morphic AND gate in C, as shown in Fig. 6.35a. 
Hence:
Fig. 6.35b demonstrates that the second level Morphic AND 
gate in C receives all codewords, so C is self testing and 
the overall checker TSC. If the 3-input Morphic AND gate 
required for C was implemented as a 2-level AND/OR 
structure (equation 6.8), then it would not be self 
testing, as section 6.6 has already shown that it requires 
all input codewords for this purpose.
Anderson concluded from his design procedure C6.63 that a 
TSC code checker could not be constructed for l-out-of-3 
or l-out-of-7 codes. He observed that a l-out-of-7 code 
had too many codewords to be converted to a 2-out-of-4 
code and if converted to a 3-out-of-6 code did not provide 
enough codewords to make the 3-out-of-6 checker self 
testing. Reddy C6.103 made the latter option possible by 
designing a 3-out-of-6 checker which required less code­
*<e,<J,c2J2> -188-
(6.15)
words to be self testing. A l-out-of-3 code checker cannot 
be designed using the method presented here either, since 
it has insufficient codewords to test even a 2-input gate. 
David C6.11U has overcome this problem by converting a 
l-out-of-3 code to a l-out-of-4 code using a TSC sequen- 
ial circuit and checking that.
Another 1-out-of-n code checker is now proposed, similar
to that previously described in example 5.6. The network
in Fig. 5.26, which is self testing only, can be TSC by 
replacing the two OR gates in the checker with EXOR gates.
Example 6.7 : Consider again the 2 to 4 line decoder with 
enable as in Fig. 6.33. Fig. 5.27 has detailed the effect 
of every single stuck-at fault within the circuit of Fig. 
5.22. Fig. 6.36 uses this information to summarise the
effects of every single stuck-at fault within the circuit 
of Fig. 6.33. It indicates that these faults can cause 
either an all 0 output or an output with two l's, but
never an incorrect code output.
Two EXOR gates are added to the circuit of Fig. 6.33 to
form an output <e^e2>, such that:
ea = <3 ©  y4 ©  y3 
and ei = y2 © yi’ (6.16)
Fig 6.37 details the complete truth table for these 
additional gates. It demonstrates that the EXOR checker 
is self testing for code inputs, but is not code disjoint. 
All input combinations to it which have an odd number of 
l's produce a code output. This is not surprising since 
the circuit is the parity checker of section 6.7. How­
ever, on the basis of single stuck-at faults, a decoder
output combination with three or five l's cannot occur.
The checker has an independent gate for each output, so it 
is effectively TSC. Note that the same is not true for 
unidirectional faults.
-189-
6.9: A TSC PERIODIC SIGNAL CHECKER
So far, only checkers for signals encoded in an error 
detecting code have been examined. There are also other 
signals in a microprocessor based system which need to be 
checked, notably single or multiphase clocks. The self 
checking synchronous machine design of section 5.8 also 
requires a checked clock signal.
Moreira de Souza et al in their design of a research 
oriented microcomputer C6.123 used a fault tolerant clock 
C6.133, derived from the work of Daly C6.143. For the 
Electronic Switching System processor, Chang et al C6.153 
detected variations in clock period by converting the 
signals to DC levels through low pass RC circuits and 
comparing these levels with known references. They 
checked for phase overlap with logic which could be tested 
via maintenance control inputs, as shown in Fig. 6.38. 
Usas C6.163 has proposed a TSC checker for single phase 
periodic signals and this is described more rigorously 
here.
A single phase clock signal is not encoded in space but
encoded in time via three parameters; t (period), tQn
and t (mark-space ratio), where t = t  + t ^ . Aoff p on off
clock signal must be checked for these parameters, illus­
trated in Fig. 6.39, as well as stuck-at faults. This is 
accomplished with the circuit of Fig. 6.40. It consists 
of two monostables Ml and M2. Ml is triggered by the 
rising edge of the clock and should be set to generate a 
high level pulse of expected width t , whereas M2 is 
triggered by the falling edge of the clock and set to 
generate a pulse of width The two monostable
outputs form a constantly changing l-out-of-2 code during 
normal operation, as shown in Fig. 6.41b.
The parameters t , t and t ^  can each remain constant,p on of f ^
increase or decrease, so there are 3 -1 = 26 faulty 
combinations or non codewords. Some of these are
-190-
impossible faults such as t constant with increased
P
t _ and t Fig. 6.41 details the possible faulton of i
combinations and their effects on the monostable outputs. 
The clock signal SA1 or SAO is equivalent to an infinite 
increase in fcoff or t respectively, i.e. special cases 
of faults 3 and 5 in Fig 6.41e and 6.41g respectively. 
Fig 6.41 demonstrates that the checker produces a non code 
output for all possible fault combinations, so it is code 
disjoint. However, a permanent fault indication only 
occurs for stuck-at faults.
Each output uses an independent monostable, so the checker 
is fault secure for all faults affecting a single mono- 
stable, as well as faults resulting in a unidirectional 
output error. The self testing property cannot be 
determined without reference to the circuit used for the 
monostable. For example, a fault in Ml that causes its 
input to be fed directly to its output will not be 
detected and it would then prevent the clock SA1 failure 
from being detected. Usas describes appropriate mono­
stables to ensure that the checker is self testing and 
hence TSC. He also details several applications for the 
checker.
More complex periodic waveforms, including multiphase 
clocks, can be generated by a self checking sequential ma­
chine, or by logic with self checking checkers monitoring 
the resultant waveform(s) for perhaps stuck-at failures or 
phase overlap. The input to both of these circuits should 
be a single phase clock checked in the manner described 
above.
6.10 : LITERATURE REVIEW
Carter et al C6.23 proposed the 2-input Morphic AND gate 
and its extension for n-inputs. Anderson C6.33 developed 
this theory and showed how the number of gate levels could 
be reduced when cascading 2-input Morphic AND gates. In 
addition, he discussed m-out-of-n checkers, presenting a
-191-
TSC 3-out-of-6 checker, which he used again for 1-out-of-n 
checkers C6.63.
There has been and continues to be a considerable amount 
written on the design of m-out-of-n checkers, either 
specifically £6.7,6.10,6.17-6.213, or as part of a wider 
discussion of TSC circuits C6.22,6.233. These papers have 
generally resulted from researchers striving to reduce the 
hardware required for such checkers (the number of gates, 
gate inputs and gate levels), in order to achieve an 
increased speed of operation, reduced costs and fewer 
tests. Although 1-out-of-n codes are a subset of m-out- 
of-n codes, checkers specifically for them have also been 
considered C6.8,6.93. A TSC checker for a l-out-of-3 code 
has not yet been designed using combinational logic, 
although David C6.113 has proposed a TSC sequential 
machine for this purpose.
There has been little added to the work of Carter and 
Anderson for TSC n-bit comparators using Morphic AND gates 
(also known as l-out-of-2 checkers), other than their 
implementation in specific technologies. These compara- 
ors usually compare two n-bit vectors, but Hughes C6.24, 
6.253 has extended the principles involved to comparators 
for more than two input vectors.
There has also been no alternative proposals to the EXOR 
trees for TSC parity checkers C6.2,6.26,6.473. What is 
discussed though is their application. A number of 
authors have studied TSC checkers for combinational 
circuits which have uncoded inputs or outputs C6.27-6.303 
and this often involves the use of parity predicion.
In addition to the codes mentioned so far, TSC checkers 
have also been proposed for Berger codes £6.31,6.323, 
separable codes in general £6.33,6.343 and low cost 
arithmetic codes £6.353. The construction of all these 
checkers using programmable logic arrays has been widely 
investigated £6.36-6.383, as well as their implementation
-192-
in various technologies C6.39-6.433.
Other designs include the TSC periodic signal checker 
described above C6.163, a self checking linear feedback 
register C6.443, a self testing only decoder circuit 
C6.453 which was redesigned to be TSC in a later paper 
C6.463, a TSC linear counter C6.473 and a self testing 
arbiter circuit for multi-microcomputer systems C6.483.
-193-
6.11 : REFERENCES
6.1) Logical design of switching systems - D. Lewin; 
Thomas Nelson and Sons Ltd, England; Second 
edition, 1974; Chapter 3.
6.2) Design of dynamically checked computers. - W. C. 
Carter, P. R. Schneider; Proc. IFIPS Congress, 
Vol. 2; Edinburgh; August 1968; North-Holland 
Publishing Company, Amsterdam (1969); pp. 878-883.
6.3) Design of self-checking digital networks using 
coding techniques - D. A. Anderson; Coordinated 
Science Laboratory Report R-527; September, 1971; 
University of Illinois, Urbana, Illinois, USA.
6.4) Private communication with W. C. Carter; May, 1983.
6.5) A theory of design of fault-tolerant computers 
using standby sparing - W. C. Carter, W. G. Bouri- 
cius, D. C. Jessep, J. P. Roth, P. R. Schneider, A. 
B. Wadia; FTCS-1*; pp. 83-86.
6.6) Design of totally self-checking check circuits for 
m-out-of-n codes; D. A. Anderson, G. Metz; IEEE 
Trans. Comput.; Vol. C-22, No. 3; March, 1973; pp. 
263-269.
6.7) Efficient design of self-checking checkers for 
m-out-of-n codes - M. A. Marouf, A. D. Friedman; 
FTCS-7*: pp. 143-149.
6.8) Microprogrammed control and reliable design of 
small computers - G. D. Kraft, W. N. Toy; Prentice- 
Hall Inc., Englewood Cliffs,Jersey, USA; 1981; pp. 
238-264.
6.9) Totally self-checking checker for 1-out-of-n code 
using two-codes - J. Khakbaz; IEEE Trans. Comput.; 
Vol. C-31, No. 7; July, 1982; pp. 677-681.
6.10) A note on self-checking checkers - S. M. Reddy; 
IEEE Trans. Comput.; October, 1974; pp. 1100-1102.
6.11) A totally self-checking l-out-of-3 checker - Rene 
David; IEEE Trans. Comput., Vol. C-27, No. 6; June, 
1978; pp. 570-572.
6.12) A research oriented microcomputer with built in 
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
6.13) Fault-tolerant digital clocking system - J. Moreira 
de Souza, E. Peixoto Paz; Electronics Letters; Vol. 
11, No. 18; September 4, 1975; pp. 433-434.
6.14) A fault tolerant digital clocking system - W. M. 
Daly, ,A. L. Hopkins, J. F. McKenna; FTCS-3*; pp.
-194-
17-21.
6.15) Maintenance techniques of a microprogrammed self- 
checking control complex of an electronic switching 
system - H. Y. P. Chang, G. W. Heimbigner, D. J. 
Senese, T. L. Smith; IEEE Trans. Comput., Vol. 
C-22, No. 5; May, 1973; pp. 501-512.
6.16) A totally self-checking checker design for the 
detection of errors in periodic signals - A. M. 
Usas; IEEE Trans. Comput.; Vol. C-24, No.5; May, 
1975; pp. 483-489.
6.17) A new design method for m-out-of-n TSC checkers
- N. Gaitanis, C. Halatsis; IEEE Trans. Comput.; 
Vol. C-32, No. 3; March 1983; pp. 273-283.
6.18) Fast and efficient totally self-checking checkers 
for m-out-of (2m+/-l) codes - C. Halatsis, N. 
Gaitanis, M. Sigala; IEEE Trans. Comput.; Vol. 
C-32, No. 5; May, 1983; pp. 507-511.
6.19) Modular realization of totally self checking 
checkers for m-out-of-n codes - C. Efstathiou, C. 
Halatsis; FTCS-13*; pp. 154-161.
6.20) Design method of totally self checking checkers for 
m-out-of-n codes; S. Piestrak;FTCS-13*;pp. 162-168.
6.21) A 3-level realization of totally self-checking 
checkers for m-out-of-n codes - T. Nanya, Y. Tohma; 
FTCS-13*; pp. 173-176.
6.22) A theory of design of fault-tolerant computers 
using standby sparing - W. C. Carter, W. G. Bouri- 
cius, D. C. Jessep, J. P. Roth, P. R. Schneider, A. 
B. Wadia; IBM Research Yorktown; RC 3261; February, 
1971; pp. 1-11.
6.23) Diagnosis and reliable design of digital systems
- M. A. Breuer, A. D. Friedman; Pitman Publishing 
Ltd., London, 1977; pp. 265-272.
6.24) Design of totally self-checking comparators with 
an arbitrary number of inputs - J. L. A. Hughes, 
E. J. McCluskey, D. J. Lu; IEEE Trans. Comput.; 
Vol. C-33, No. 6; June, 1984; pp. 546-550.
6.25) Design of totally self-checking comparators with an 
arbitrary number of inputs - J. L. A. Hughes, E. J. 
McCluskey, D. J. Lu; FTCS-13*; pp.169-172.
6.26) Self-testing embedded parity trees - J. Khazbaz; 
FTCS-12*; pp. 109-116.
6.27) A self-testing group-parity prediction checker and 
its use for built-in testing - E. Fujiwara, N. 
Mutoh, K. Matsuoka; IEEE Trans. Comput.; Vol. C-33,
-195-
No.6, June, 1984; pp. 578-583.
6.28) A totally self-checking generalized prediction 
checker and its use for built-in testing - E. 
Fujiwara, K. Matsuoka; FTCS-15*; pp. 384-389.
6.29) The design of totally self-checking embedded 
checkers - N. K. Jha, J. A. Abraham; FTCS-14*; pp. 
265-270.
6.30) A self-testing group-parity prediction checker and 
its use for built-in testing - E. Fujiwara; FTCS- 
13*; pp. 146-153.
6.31) Design of fast self-testing checkers for a class of 
Berger codes - S. J. Piestrak;FTCS-15*;pp. 418-423.
6.32) Design of self-checking checkers for Berger codes
- M. A. Marouf, A. D. Friedman;FTCS-8*;pp. 179-184.
6.33) On totally-self-checking checkers for separable
codes - M. J. Ashjaee, S. M. Reddy; FTCS-7*; pp. 
151-156.
6.34) On totally-self-checking checkers for separable
codes - M. J. Ashjaee, S. M. Reddy; IEEE Trans. 
Comput.; Vol C-26, pp. 737-744.
6.35) Totally self checking checkers for low cost arith­
metic codes - N. Gaitanis; FTCS-14*; pp. 260-264.
6.36) Design and application of self-testing comparators 
implemented with MOS PLA's - Y. Tamir, C. H.
Sequin; IEEE Trans. Comput.; Vol. C-33, No. 6;
June, 1984; pp. 493-506.
6.37) PLA implementation of k-out-of-n code TSC checker
- B. Bose, D. J. Lin; IEEE Trans. Comput.; Vol. C- 
33, No. 6; June, 1984; pp. 583-588.
6.38) The design of totally self-checking circuits using 
programmable logic arrays - S. L. Wang, A. Avizien- 
is; FTCS-9*; pp. 173-180.
2
6.39) Multivalued I L circuits for TSC checkers - D. 
Etiemble; FTCS-9*; pp. 181-184.
6.40) Threshold I^L totally self checking circuits - Y. 
W. Yang, S. M. Liu, T. C. Chen; FTCS-10*; pp. 284- 
287.
6.41) Analysis of a class of totally self-checking 
functions implemented in a MOS LSI general logic 
structure - M. W. Sievers, A. Avizienis; FTCS-11*; 
pp. 256-261.
6.42) Totally self-checking CMOS circuits using a hybrid 
relalization - N. K. Jha, J. A. Abraham; FTCS-15*;
-196-
pp. 154-158.
6.43) Techniques for efficient MOS implementation of 
totally self-checking checkers - N. K. Jha, J. A. 
Abraham; FTCS-15*; pp. 430-435.
6.44) Self-checking linear feedback shift registers - D. 
J. Lu; FTCS-10*; pp. 269-271.
6.45) A simple self-testing decoder checking circuit
- W. C. Carter, K. A. Duke, D.C. Jessep; IEEE 
Trans. Comput.; November, 1971; pp. 1413-1415.
6.46) Theory and use of checking circuits - W. C. Carter; 
Infotech*; pp. 415-454.
6.47) Error-correcting codes and self-checking circuits
- D. K. Pradhan, J. J. Stiffler; Computer; March, 
1980; pp. 27-37.
6.48) A self-testing arbiter circuit for multimicrocom­
puter systems - M. Courvoisier, J. C. Geffroy, J. 
P. Seek; FTCS-10*; pp. 281-283.




























FIGURE 6.1 SELF CHECKING CIRCUITS
- = no fault detected 
1 = SA1 fault detected 
0 = SAO fault detected










1 - 1 
1 1 
0 0 0
FIGURE 6.2 FAULT TABLE FOR 2-INPUT AND GATE










1 1 1  
0 0 
0 - 0





a b e d
0 0 0 
0 0 1 
0 1 0 
0 1 1 
1 0 0 
1 0 1 
1 1 0 
1 1 1
1 - - 1
1 - 1 
1 1 





a b e d
0 0 0 
0 0 1 
0 1 0  
O i l  
1 0 0 
1 0 1 
1 1 0  
1 1 1
1 1 1 1  
0 0 
0 - 0  
0































a b e d
STUCK-AT FAULTS 
DETECTED 
a b c d e
DETECTED 
b c
0 0 1 1 1 0 0 0 I 1 1 1 0 0 0 0 1 1 1 1 1
0 1 1 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0
1 0 0 1 0 0 1 0 1 0 1 0 0 0 1 0 1 1 0 1 0
1 1 0 0 1 0 1 1 1 0 0 1 0 0 1 1 1 1 0 0 1
1 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0
1 0 1 0 1 0 1 0 1 0 1 1 0 1 0 1
1 1 0 0 0 1 1 0 1 1 0 1 0 0 1 1
1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0
1 0 0 0 0 1 1 1 0
1 0 0 1 0 1 1 0 1
1 0 1 0 0 1 0 1 1
1 0 1 1 0 1 0 0 0
1 1 0 0 0 0 1 1 1
- 1 1 0 1 0 0 1 0 0
1 1 1 0 0 0 0 1 0
1 1 1 1 0 0 0 0 1
FIGURE 6.6 FAULT TABLES FOR EXCLUSIVE-OR GATES
(a)
AND INPUTS
FAULTS d df EQUIVALENT OR INPUTS TOV t € T& DETECTED (FAULT (d WITH FAULT IN PROPAGATE FAULTa b c FREE) FAULT) f e f 9
0 1 1 a,d SA1 0 1 SA1 0 0 0
1 0 1 b,d SA1 0 1 SA1 0 0 0
1 1 0 c,d SA1 0 1 SA1 0 0 0
1 1 1 a,b,c ,d SAO 1 0 SAO 0 1 0
(b) ADDITIONAL OR INPUTS ( 1 0 0
FOR V t € T-*- ^ 0 0 1












FIRST LEVEL J SECOND LEVEL
(a) CIRCUIT
rESTS FOR FIRST LEVEL CATES SECOND LEVEL INPUTS 
TO PROPAGATE FAULT
a 9 c d
©
e f g 9 i re n o
r 0 1 1 X X 0 0 0
1 0 1 X X 0 0 0V t 6 T& < 1 1 0 X X 0 0 0
I 1 1 1 X X 1 0 0
X 0 1 1 X 0 0 0
X - OUTPUT X 1 0 1 X 0 0 0
AT 0 X 1 1 0 X 0 0 0
X 1 1 1 X 0 1 0
X X 0 1 1 0 0 0
12 CONDITIONS X X l 0 1 0 0 0
TO SATISFY X X l 1 0 0 0 0
X X l 1 1 0 0 1
(b) REQUIRED TESTS (12) V t € T+
X - OUTPUT 
AT 0
6 TESTS IN 
FULLY MERCED 
TEST SETS
a b c d e f 9 h 1 m n o
0 1 1 0 1 1 0 1 1 0 0 0
1 0 1 1 0 1 1 0 1 0 0 0
1 I 0 1 1 0 1 1 0 0 0 0
1 1 1 X X 1 0 0
X 1 1 1 X 0 1 0
X X 1 1 1 0 0 1
a b c d e f g h 1 m n o
0 1 1 1 1 0 l 0 1 0 0 0
1 1 0 1 0 1 0 1 1 0 0 0
1 1 0 1 0 1 0 1 1 0 0 0
1 1 1 X X 1 0 0
X 1 1 1 X 0 1 0
X X l 1 1 0 0 1
(C) POSSIBLE MERCED TEST SETS
FIGURE 6.8 TESTING A 2-LEVEL AND/OR STRUCTURE
-200-
V t € T+
© <D © © © © © ©
100 X X 111 Oil X X Oil
010 X X 111 \J ¥• C. TC / 101 X X Oil
001 X X 111 V C c T& C 110 X X Oil
000 X X Oil 111 X X 111
X 100 X 111 X Oil X 101
X 010 X 111 X 101 X 101
X 001 X 111 X 110 X 101
X 000 X 101 = -*■ X 111 X 111
X X 100 111 X X Oil 110
X X 010 111 X X 101 110
X X 001 111 X X 110 110
X X 000 110 X X 111 111
AT V . Y t S, X = OUTPUT AT I V :"TI
i.e. INPUT 111
12 CONDITIONS TO SATISY 
EXTENSIVE MERGING POSSIBLE
(a) OR/AND
10 CONDITIONS TO SATISFY 
NO MERGING POSSIBLE
<b> AND/AND
V t € T+
© © © ©
100 X X 100
010 X X 100
001 X X 100
000 X X 000
X 100 X 010
X 010 X 010
X 001 X 010
X 000 X 000
X X 100 001
X X 0)0 001
X X 001 001
X X 000 000
© © © ©
Oil X X 000
101 X X 000
110 X X 000
111 X X 100
X 100 X 010
X 010 X 010
X 001 X 010
X 000 X 000
X X 01 000
X X 10 000
X X 11 001
X * OUTPUT AT 1 
I.e. INPUT - 000
V t e T+
10 CONDITIONS TO SATISY 
NO MERGING POSSIBLE
(C) OR/OR
X - OUTPUT AT 1 
FOR(2)THIS IS INPUT - 000
11 CONDITIONS TO SATISFY 
SOME MERCINC POSSIBLE
(d) AND.OR/OR
FIGURE 6.9 TESTING 2-LEVEL STRUCTURES
-201-
e x o r(T)
INPUTS 

















0 1 a SAl.b SAO.c SAO 1 0 SAO 1 a
1 0 a SAO,b SA1,c SAO 1 0 SAO 1 A
1 1 a SAO,b SAO,c SA1 0 1 SA1 0 A
a = 0 OR 1, BUT 
REQUIRE V t € T ©
FIGURE 6.10 TESTING CASCADED EXCLUSIVE-OR GATES
1) START 
AT A 
V t € T&
2) START 
AT B 
V t € T+
3)
V t € T& 
FOR C
. $ « d e f , i
O i l  
1 0  1 
1 1 0  
I 1 1
0 0 0 
0 0 0 
0 0 0 
1 0  0
0 1 1 
O i l  





^ 1 0  0 
0 1 0  
0 0 1 
0 0 0
1 1 1  
1 1 1 










O i l  
1 0  1 
1 1 0  
1 1 1
<-* - TEST ALREADY EXISTS 
' 1' - OUTPUT AT 1, '0‘ - OUTPUT AT 0
A - OUTPUT AT 0 OR 1
(b)
FIGURE 6.11 TESTING MORE THAN TWO
CASCADED GATES













c*d = l) 00 «* 1
01 1 c*0) 01
I
11 fl A * a=d* lj 11 (Ti 1 '»
10 / J \ 10
t *
u 1 1












00 01 11 10
(a=0









FIGURE 6.12 EXPANSION OF A PRIME IMPLICANT (PI)
ab
cd CODE INPUTS 
00 01 11 10 * b c d
PRIME IMPLICANTS 
ad be
0 0 0 0 0 1 0  1 1 1 0  0 
1 1 1 0  
1 0  1 1
01 10 , MERGED




(°) '1 ij) &
0 1 (*) 0 r iV t 6 T& ♦ t ( TS*
ad + be C* no
(a)
@  In expansion of b 
in be, only a Q) 3
ab iCd00 01 11 10
00 0 0 0 0
01 0 ® f1
11 0 fl ij










0 1 0 1 01 10 -
1 0 0 1 11 00








C* no 11 in be since PI does 













0 1 0 1 01 10 ■*— |
1 1 1 1 11 11
1 0 0 1 11 00
1 0 1 0 10
f









C* V t € T& for both Pis, but no 
condition where bc*l and ad=0, 
since no ©  unique to bc3
FIGURE 6.13 NON TSC CHECKER DESIGNS
-203-
ab V O 
CL




01 0 (l r>j
11 0 <s> - ®
10 0 0 0 0
01 11 10
00 m 0 (°) &
01 j) 0 lil
11 (») 0 It) ij
10 0 0 0 0
01 11 10
00 ij 0 (\
01 _lj (°) 0
11 0 0 0 0
10 fo 0) 0
CODE INPUTS PRIME IMPLICANTS
a b c d ad be
1 1 0 1 01 10
1 1 1 0 00 11
0 0 0 1 11 00
0 0 1 0 10 01
y * ad + be (.
CODE INPUTS PRIME IMPLICANTS
a b c d ad be
1 1 0 0 01 10
1 1 1 1 00 11
0 0 0 0 11 00
0 0 1 1 10 01
7 * ad + be (1
CODE INPUTS PRIME IMPLICANTS
a b c d ad be
1 0 1 0 01 10
1 0 0 1 00 11
0 1 1 0 11 00
0 1 0 1 10 01












































(bl - T -




















TSC n x l-OUT-OF-2 
TO 1 X l-OUT-OF-2 
CONVERTER
(b)
FIGURE 6.16 TSC COMPARATORS
1\ 00 01 11 10
!b2
00 01 11 10
00 0 0 1 1 00 1 0 0 1
01 0 (0) 1 0) 01 0 (') 1 <8>
11 1 1 0 0 11 0 1 1 0






INPUTS GATE INPUTS OUTPUTS
al bl a2 b2 cd
ef *1 *2
0 0 0 0 00 00 0 1
0 0 0 1 00 01 0 0
0 0 1 0 01 00 1 1
0 0 1 1 01 01 1 0
o 1 0 Q 00 10 0 0
0 1 0 1 00 11 Q 1
0 1 1 0 01 . 1SL _ 1 0
0 1 1 1 01 11 1 1
1 0 0 0 10 00 1 1
1 0 0 1 10 01 . 1 0
1 o 1 o 11 00 o 1
1 0 1 1 11 01 0 0
1 1 0 0 10 10 l 0
1 1 0 1 10 11 l 1
1 1 1 0 11 10 0 0
















y_ - a,a_ + b.b.
(a) KARNAUGH MAPS
2 x l-OUT-OF-2















X - OUTPUT AT 0 
6 CONDITIONS TO SATISFY














































c d e f g h i j yl

































X INDICATES FAULT DETECTED 
< e) FAULT TABLE FOR SUBCIRCUIT A
FIGURE 6.18 TSC 2-INPUT AND/OR MORPHIC AND GATE
-206-
b2
00 01 n 10 ai bi \^
b2
00 ol 11 10
00 roj l ls_ 00 t?| oj 1 r
01 V2_ -<°> l 0 ) 01 0 ) 1 (»x
11 l l l 1 11 1 1 1 i
10 °1 (V l (W 10 fo­ 1 0

















*1 bl a2 b2 *1 *2
0 0 0 0 0 0
0 0 0 1 0 0
0 0 1 0 0 0
0 0 1 1 1 1
0 1 0 0 0 0
0 1 0 1 0 I
0 1 1 0 1 0
0 1 1 1 1 1
1 0 0 <? o 0
1 0 0 1 1 0
1 0 1 0 0 1
1 0 1 1 1 1
1 1 0 0 1 1
1 1 0 1 1 1
1 1 1 0 1 1
1 1 1 1 1 1
(c) TRUTH TABLE
INPUTS 
al bl a2 b2
GATE INPUTS
OUTPUT
ylcd © f <?,
0 1 0  1 00 11 01 0
0 1 1 0 01 10 11 1
1 0  0 1 10 01 11 1
1 0  1 0 11 00 10 0
(d) TESTING SUBCRCUIT A













*1* ■ 01 or 10 




al bl a2 b2 y21 y22
0 0 0 0 
0 0 1 1  
1 1 0  0 





0 0 0 1 
0 0 1 0  
1 1 0  1 





0 1 0  0 
0 1 1 1  
1 0  0 0 





0 1 0  1 
0 1 1 0  
1 0  0 1 


















0 1 0 1 0 1 0 1 0 1
0 1 0 1 0 1 1 0 1 0
0 1 1 0 1 0 0 1 1 0
0 1 1 0 1 0 1 0 0 1
1 0 0 1 1 0 0 1 1 0
1 0 0 1 1 0 1 0 0 1
1 0 1 0 0 1 0 1 0 1
1 0 1 0 0 1 1 0 1 0
(b) TRUTH TABLE
FIGURE 6.21 3-INPUT MORPHIC AND GATE
-208-
SUBCIRCUIT A SUBCIRCUIT B
31
,2b2Nl3b3

















































































*2 a3 bl a3 c ? e f
0 1 0 1 0 1 0 1 1 1 0 1 0 0 0 1 1 0 0 0 0 0 0
0 1 0 1 1 0 0 1 0 1 0 0 0 0 1 1 1 1 0 0 0 1 1
0 1 1 0 0 1 0 0 1 1 1 1 0 1 0 1 0 0 0 1 0 0 1
0 1 1 0 1 0 0 0 0 1 1 0 0 1 1 1 0 1 0 0 0 0 0
1 0 0 1 0 1 1 1 1 0 0 1 1 0 0 0 1 0 1 0 0 0 1
1 0 0 1 1 0 1 1 0 0 0 0 1 0 1 0 1 1 0 0 0 0 0
1 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0
1 0 1 0 1 0 1 0 0 0 1 0 1 1 1 0 0 1 0 0 1 0 1
(C) TESTING SUBCIRCUIT A





(a) CASCADED 2-INPUT MORPHIC AND CATES
*41 42
(b) 2-LEVEL AND/OR STRUCTURE 
























2 x 1—OUT—OF—4 1 x 3—OUT—OF—4 
(b) MERGED





1 0 0 0 0 
0 1 0  0 0 
0 0 0 1 0 
0 0 0 0 1
1 0 0 0 0 ^
— *7 . 0 1 0 0 0 \
— ,"V>*— ooio o -1 y ~ ~ ~
 - 0 0 0 1 Q — ■ —J
FIGURE 6.25 TESTING A 4-INPUT OR GATE
UERCEO AND 
CATES
2 x 3—OUT—OF—4 1 x 1—OUT—OF—4





















- MORPHIC AND CATE
24 x 2-INPUT AND GATES 
18 x 2-INPUT OR GATES 
6 LEVELS
ys2
M+ - MERGED OR GATES 
MAc - MERGED AND GATES
16 x 2-INPUT AND GATES 
2 x 2-INPUT OR GATES 
8 x 4-INPUT OR GATES 
4 x 4—INPUT AND GATES 
4 LEVELS











n x DUPLICATION ENCODED PAIRS 1 x l-OUT-OF-2

























a b c d e f *1
0 0 0 0 0 0 0
0 0 0 1 0 1 1
0 0 1 0 0 1 1
0 0 1 1 0 0 0
0 1 0 0 1 0 1
0 1 0 1 1 1 0
0 I 1 0 1 1 0
0 1 1 1 1 0 1
1 0 0 0 1 0 1
1 0 0 1 1 1 0
1 0 1 0 1 1 0
1 0 1 1 1 0 1
1 1 0 0 0 0 0
1 1 0 1 0 1 1
1 1 1 0 0 1 1
1 1 1 1 0 0 0
t t t
V t € T ®
(b) GROUP 1 TRUTH TABLE
(a)
GATE INPUTS OUTPUT
*2g h 1 j k 1 D n
0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 1 1
0 0 0 1 1 0 0 1 1
0 0 0 1 1 1 0 0 0
0 0 1 0 1 0 0 1 1
0 0 1 0 1 1 0 0 0
0 0 1 1 0 0 0 0 0
0 0 1 1 0 1 0 1 1
0 1 0 0 0 0 1 0 1
0 1 0 0 0 1 1 1 0
0 1 0 1 1 0 1 1 0
0 1 0 1 1 1 1 0 I
0 1 1 0 1 0 1 1 0
0 1 1 0 1 1 1 0 1
0 1 1 1 0 0 1 0 1
0 1 1 1 0 1 1 1 0
1 0 0 0 0 0 1 0 1
1 0 0 0 0 1 1 1 0
1 0 0 1 1 0 1 1 0
1 0 0 I 1 1 1 0 1
1 0 I 0 1 0 1 1 0
1 0 1 0 1 1 1 0 1
1 0 1 1 0 0 1 0 1
1 0 1 1 0 1 1 1 0
1 1 0 0 0 0 0 0 0
1 1 0 0 0 1 0 1 1
1 1 0 1 1 0 0 1 1
1 1 0 1 1 1 0 0 0
1 1 1 0 1 0 0 1 1
1 1 1 0 1 1 0 0 0
1 1 1 1 0 0 0 0 0
1 1 1 1 0 1 0 1 1
(c >
V t € T ©
CROUP 2 TRUTH TABLE
FIGURE 6.30 TSC 9-BIT PARITY CHECKER
DECODER
n = 2
.FIGURE 6.31 m TO n LINE DECODER
-213-
C1
• 1-OUT-OF-n • k-OUT-OF-2k
TO • TO
k-0UT-0F-2k 1-OUT-OF-2
• CONVERTER • CONVERTER
C2k
C1
• 1-OUT-OF-n dl k x l-OUT-OF-2









FIGURE 6.33 A 2 TO 4 LINE DECODER WITH ENABLE
-214-
CODE INPUTS OUTPUTS
x5 X4 x3 X2 X1 C1 dl c2 d2 C3 d3
0 0 0 0 1 
0 0 0 1 0 
0 0 1 0  0 
0 1 0  0 0 






















































































«1 bj a 2  b2 ©  V,*21*22 "3 3
0 1 1 0  
1 0  0 1 
1 0  1 0  
0 1 0 1 
0 1 1 0
1 0  0 1 
1 0  0 1 
0 1 0 1 
0 1 1 0  
1 0  1 0
1  0 







FIGURE 6.35 DESIGN OF CIRCUIT C
NODE FAULT ENABLEg
POSSIBLE EFFECT ON


























































*4 *3 *2 *1 *2 el
0 0 p 0 p 0 p
0 0 0 0 1 p 1
0 0 0 1 0 0 1
0 0 0 1 1 0 0
0 0 1 0 0 1 0
0 0 1 0 1 1 1
0 0 1 1 0 1 1
0 0 1 1 1 1 0 4—
0 I 0 0 p 1 0
0 1 0 0 1 1 1
0 1 0 1 0 1 1
0 1 0 1 1 1 0 4—
0 1 1 0 0 . 0 0
0 1 1 0 1 0 1 4—
0 1 1 1 0 0 1 4—
0 1 1 1 1 0 p
1 0 0 0 0 1 0
I 0 0 0 1 1 1
1 0 0 1 0 1 1
1 0 0 1 1 1 0 4—
1 0 1 0 0 0 0
1 0 1 0 1 0 1 4—
1 0 I 1 0 0 1 4—
1 0 1 1 1 0 0
1 1 0 0 0 0 0
1 1 0 0 1 0 1 4—
1 1 0 1 0 0 1 4—
1 1 0 1 1 0 0
1 1 1 0 0 I 0 4—
I 1 1 0 I 1 1
1 1 1 1 0 1 1

























FIGURE 6.38 CLOCK PHASE GENERATION
AND CHECKING
T '<>" , toff ;
tp






FIGURE 6.40 TSC PERIODIC SIGNAL CHECKER
-218—
Foutt No. . J P ton . toff
1 C 1 1
2 C » !
3 » C ♦
4 1 c 1
5 \ » c
6 1 c
7 \ ! 1
8 1 1 J
9 t 1 1
10 1 I *
11 » 1 1
12 1 1 1
\ -  INCREASE 







m i j t
u I - J l
1 (c)L FAULT 1
t / p f
M1J"
M2
u. T  (d)1 L FAULT 2








l / p j
M1J"
M2
l / p j
M lj"
M2





































FIGURE 6.41 TSC PERIODIC SIGNAL CHECKER FAULTS AND WAVEFORMS
-219-
CHAPTER SEVEN : FAULT DETECTION AND DIAGNOSIS
IN SELF CHECKING SYSTEMS
7.1 : INTRODUCTION
The maintenance process for a computer system consists 
of the following five stages; fault detection, error 
correction, fault diagnosis/isolation, error recovery and 
repair. All of these stages are essential in fault 
tolerant systems, but error correction and error recovery 
are of particular importance 17.1-7.4,7.343. Fault detec­
tion and fault diagnosis form the basis of this Chapter.
The aim of this investigation is to use self checking 
techniques as a means of detecting and locating failures 
at circuit, chip or component level in a microprocessor 
based system. Whilst error correction and recovery are 
obvious developments from this aim, they are well documen­
ted elsewhere 17.5-7.7,7.353 and not considered here.
There have been many fault tolerant computers developed 
over the years. Although few of them have used self 
checking checkers, the self checking philosophies adopted 
within them are fundamental to systems of the type 
described in this thesis, which do employ such checkers. 
Fault tolerant computers in this category include:
1) STAR (Self Test And Repairing) 17.2,7.4,7.83 : One of
the earliest and most well known of fault tolerant 
computers, its development was started in 1961 at 
the Jet Propulsion Laboratory, under the guidance of 
Algirdis Avizienis.
2) ESS (Electronic Switching System) 17.5,7.9-7.123 :
Various models have been developed at the Bell 
Laboratories for telephone switching.
3) FTSC (Fault Tolerant Spaceborne Computer) 17.133 : Has
a 95% probability of surviving unattended and with­
out degradation for five years.
4) MECRA 17.3.43 : A self reconfiguring computer developed
-220-
in France.
5) Plessey System 250 C7.343 : A fault tolerant modular
processing system for control applications.
6 ) Sperry Univac 1100/60 and System 11 £7.7,7.151 : Both
are information processing systems.
The Chapter commences with a discussion on the positioning 
of self checking checkers within a system. Section 7.3 
then examines suitable error detection codes for data 
transfer paths and memory. Problems caused by selection 
and addressing errors are also investigated. Various 
error detection mechanisms are detailed in section 7.4 for 
the control paths of a system, which include clocks, 
control lines and decoders. Duplication, check bit 
correction and check bit prediction are the principal 
techniques considered for fault detection in arithmetic 
and logical operations in sections 7.5 and 7.6 respective­
ly. All of the fault detection mechanisms in sections 7.2 
to 7.6 are specifically for hardware faults. However, 
software faults can also occur, so some special hardware 
checks for these faults are outlined in section 7.7. 
Finally, fault diagnosis is the subject of section 7.8 
which includes fault indication, signal isolation for 
fault isolation, error indications from transient condi- 
ions, the verification and efficiency of fault detection 
mechanisms and marginal testing.
7.2 : POSITIONING OF CHECKERS
Sellers C7.163 gives criteria for the positioning of 
checkers in coded data paths as follows:
1) Undetectable errors and faults should be minimised.
2) An error should be detected before it corrupts other 
data, so that error indications are meaningful for 
maintenance and so that erroneous data can be recon­
structed.
3) The fault should be located to within the smallest 
amount of hardware necessary to ensure that servicing
-221
is straightforward.
4) The cost of the checker should be minimised.
There must always be trade-offs between the amount and 
complexity of checkers to satisfy the first three criteria 
and the minimum cost requirement of the last.
The position and quantity of checkers should therefore 
enable as many faults and fault types to be detected 
during normal operation of the system. The measure for 
such a detection capability is referred to as fault 
detection coverage. It can be readily determined by 
examining which component, gate, device or interconnection 
failures are detected by the fault detection circuits 
present.
Suitable positions for checkers in a network of self 
checking blocks are at the outputs of the network and at 
the inputs of each non code disjoint block. The position­
ing of checkers must ensure that a network is self testing 
and fault secure. Individual blocks are assumed to be 
self testing and fault secure. If the network satisfies 
theorem 5.1, then a checker is only necessary at its 
outputs. Network fault security says that all faults will 
produce the correct output or a non codeword, whilst 
network self test says that all faults will produce a non 
code output for some normal network input. If the network 
contains blocks which are not code disjoint, i.e. those 
which produce code outputs from non code inputs, then the 
network is not totally self checking (from theorem 5.1). 
However, the network can still be fully checked with a 
checker at its output if additional checkers are position­
ed at the inputs of these non code disjoint blocks. 
Wakerley, however, indicates that there need not be a 
checker on every input C7.173.
The two criteria above give sufficient theoretical posi­
tions for the checkers. In practice, additional checkers 
are positioned to improve the diagnosability of the
-222-
network. Sedmak C7.183 gives three categories which 
should have a high concentration of checkers, particularly 
if recovery procedures are to be implemented. These are:
1) High failure rate logic.
2) Critical logic and logic with a high usage during 
normal operation.
3) Storage elements which are accessed infrequently, or 
where the period between write and read operations is 
relatively long.
7.3 : DATA PATHS AND MEMORY
7.3.1 : Choice of Error Detection Codes
Chang C7.53 describes the use of a check register to 
detect multiple errors in data transmission and storage, 
which does not use an error detecting code and operates as 
follows. Source data is toggled into an initially all 
zero check register. After this data is transferred to 
its destination register, it is -immediately read back and 
toggled into the check register once more. The check 
register should then again contain all zeros. If not, an 
error is signalled. Time redundancy is used to perform 
the additional read operation.
Errors in the transmission and storage of binary data, 
instructions, and addresses can be detected in data which 
is encoded in an error detecting code. Codes used for 
this application are generally separable. They are chosen 
on the basis of their cost, their effectiveness in detect­
ing the set of likely failures and their impact on system 
performance. The implementation cost of a code can be 
determined by the number of redundant bits in the code and 
by the cost of the appropriate self checking checker. 
The effectiveness of the code can be principally determin­
ed by its minimum Hamming distance (see section 4.3.10), 
but knowledge of the sizes of its self checking tested and 
fault secure sets are also significant.
-223-
The logical design of data path circuits also plays an 
important part in determining which faults can be easily 
detected and require a minimum amount of error checking
hardware. Flip-flops, for example, which can be used to
form registers for data processing, need to be carefully 
designed so that logic faults do not result in indetermin­
ate (transient) errors which make diagnosis difficult or 
impossible. Kraft & Toy C7.19D show a flip-flop design 
which has a race condition when certain faults are 
present. They gives two possible solutions to this
problem; firstly, additional control lines for diagnostic 
purposes and, secondly, a circuit redesign to avoid this 
condition.
Thus the logic structure as well as the error detection 
code must be considered to achieve the most efficient 
circuit design from the viewpoints of both error detection 
and fault diagnosis. The effects of all possible faults 
must be considered and the error detection code chosen to 
detect all faults, or at least a* reasonable subset.
The circuits used for data paths and memory are generally 
bit(byte)-sliced circuits with no interaction between 
bits (bytes), except for control lines (data selection and 
addressing), which are considered separately in section
7.3.2. If data is encoded in an error detecting code and 
the circuit is non transforming, i.e. performing only data 
storage, transmission or switching (see section 5.3.1), 
then it is self checking, since the output is always a 
codeword with no failures present. The size of the tested 
and fault secure sets for such circuits depend on the code 
used and it is these sets which can be used to compare the 
effectiveness of each code in detecting faults. Fig. 7.1 
details this and other important properties for the
separable codes discussed in this Chapter. It is assumed 
that the normal input set of vectors test for all faults 
affecting a single bit(byte)-slice, with the relative
checker speeds and costs based on the use of SSI and MSI
-224-
TTL. For the non transforming circuits of examples 5.1 
and 5.2, this means that all input combinations are 
applied to each bit(byte)-slice during normal operation.
The simplest error dectecting code is single bit parity, 
discussed in section 4.3.1. This code has n data bits and 
n+1 bits overall. A non transforming circuit for parity 
encoded data is fault secure for all faults affecting only 
a single bit-slice and self testing for all faults affect­
ing fewer than all bit-slices, for reasons given in 
section 5.3.1. The circuit is thus fault secure for the 
smallest reasonable fault set, but self testing for a 
large fault set that contains almost all faults. Single 
bit parity has minimum redundancy and out of all codes 
with n data bits, the simplest totally self checking 
checker. From section 6.7, this checker is an p-bit EXOR 
tree and an q-bit EXOR tree, such that:
p + q = n + l ,  where p and q  )/ 1 (7.1)
A duplicated circuit is both self testing and fault secure 
for all faults that affect different combinations of bits 
in the two circuits. Duplication is therefore better than 
simple parity, in terms of fault security, but there are 
many faults which are never tested during normal opera­
tion. In particular, any double fault that affects a 
single bit-slice and its duplicate in the same way is 
undetectable. Wakerley C7.173 suggests that such a fault 
is more likely than a fault affecting all n+1 bits in a 
parity check code. Using this assumption, he shows that 
parity can be more effective than duplication in terms of 
its self testing ability. Duplication has the highest 
redundancy and requires an n-bit TSC equality checker, 
both of which can involve high costs. It is the most 
complete and efficient check for certain operations, as 
there are no constraints with the circuit for fault 
dectection. Duplication is used where fault types cannot 




A b-adjacent code, discussed in section 4.3.2, is capable 
of detecting all errors within a single b-bit byte and is 
therefore ideal for a b-bit sliced circuit. From section
4.3.2, the code has k x b-bit data bytes and a single
b-bit check byte. Its checker consists of b x p-input 
EXOR trees and b x q-input EXOR trees, such that:
p + q =  k + 1, where p and q >, 1 (7.2)
plus a b-input Morphic AND gate. Since the code detects
all single byte errors, a non transforming circuit for 
data in a b-adjacent code is fault secure for all faults
that affect a single byte-slice. Analogous to a single
bit parity checked circuit, this circuit is self testing 
for all faults that affect fewer than all byte-slices, 
since for each of these faults, there is a normal input 
which produces a non code output. However, it is not 
necessarily faults affecting all bits of all bytes which 
are undetectable. For example, a fault which affects bit 
i of every byte, such that the output from these bits is 
a codeword, is also undetectable. Overall, though, for 
applications where faults affecting all bytes of an
encoded word are unlikely, the b-adjacent code has the 
fastest and cheapest checker out of all the codes with b 
check bits discussed in this Chapter.
The low cost residue codes with check base A=2^-l,
discussed in section 4.3.6, are also capable of detecting
all errors within a single b-bit byte. Like the b- 
adjacent codes, they have b redundant bits, but a more 
costly checker, which employs a tree of b-bit adders. 
They do not, however, detect errors of magnitude 2^-1, 
so a byte which changes from 2^-1 (all l's) to 0 (all 
0's), or vice versa, is not a detectable error. As a
result, a non transforming circuit which employs a low 
cost code is fault secure for all single byte faults,
except all bits stuck at either 0 or 1. It is also fault 
secure for a large class of multiple byte faults producing
-226-
unidirectional errors C7.203. The circuit is self
testing for all faults that affect fewer than all bits, 
assuming there is no fault in a byte-slice whose only 
effect is to change 2^-1 to 0, or vice versa. Such a 
circuit, when compared to a similar circuit using a to­
ad jacent code, is thus self testing for a larger fault 
set, fault secure for a slightly smaller set, has the same 
redundancy, tout its checker is more costly and slower. 
The low cost residue codes can also be directly used to 
perform arithmetic operations.
Pradhan and Stiffler C7.213 discuss coding for LSI 
circuits and transient faults, with the conclusion that 
the most common faults are unidirectional. They also 
suggest that Berger codes will toe the most important code 
for the detection of unidirectional errors in future 
computers, because they are separable codes. In addition, 
Pradhan has proposed a new code which is not only compati­
ble with parity check codes and able to correct a limited 
number of errors, but is also able to detect certain 
unidirectional errors C7.21,7.223. Error correction is 
desirable to cope with and recover from the effects of 
transient errors. Traditionally, the memory unit of a 
computer has tended to be the most unreliable unit, so 
error correcting codes are also to be found there.
7.3.2 : Selection and Addressing Errors
The discussion so far applies to the detection of errors 
in data transmission and storage. The detection of errors 
in the selection mechanisms for data transfer, or the 
addressing of data in memory, also need to be considered.
Separate error detection schemes can be used for data and 
control. Parity on data detects all single bit errors 
within a register. It is assumed that all faults within 
the register will only affect one bit. Parity is checked 
on a data transfer, but does not guarantee that the data 
has been written to or read from the correct register.
-227-
Bit e r r o r s  during the write operation are detected when 
the data is read. This is adequate because erroneous data 
within a register is not a problem until it is read. 
Decoder faults affecting a complete word may not be not 
detected, so the control line outputs from the decoder 
need to be checked, see section 7.4.3.
Decoder faults manifiest themselves, in general, as 
multiple bit errors in a data word. A decoder check will 
eliminate the need to detect these errors. However, if 
the error detection code capability of the data is 
expanded to include the entire data word, a decoder check 
is not required.
A simple address decoder for m encoded data words of k- 
bits is shown in Fig. 7.2. A decoder translates the input 
address data into a 1-out-of-n code to store, or retrieve 
the desired word. These words are encoded in an error 
detecting code such that a storage failure or bus failure 
is detectable. A decoder failure, however, may result in 
the selection of the wrong word.' The erroneously selected 
word is a codeword, so the fault is not detected by 
checking its encoding.
Selection failures can be detected by dividing the memory 
into bit-slices and providing individual address decoding 
for each bit, as shown in Fig. 7.3. Now a failure in the 
selection circuitry produces only a single bit error in 
the word, which is detected by the error detection scheme 
for the data.
A single parity check bit will then detect all faults in a 
single bit slice and its decoder, and single bit faults in 
the data transfer. The circuit is fault secure for these 
faults. This technique is inherent in memory chips which 
are 1-bit wide. If the memory chips are b-bits wide and 
the error detecting code is capable of detecting b-bit 
errors, then the decoder circuitry only needs to be 
duplicated for each byte-slice. Suitable codes are the
-228-
b-adjacent (interleaved parity) checksum or Berger codes. 
For example, if the memory is 4-bits wide, the 4-adjacent 
code can be used whereby every parity bit covers one bit 
from each 4-bit slice. This is illustrated in Fig. 7.4. 
In all cases, the error detection code requires the memory 
to be one chip wider than normal.
It is assumed in all of the above discussion that if 
decoder input errors are to be detected, then the address 
information will also be encoded in an appropriate error 
detecting code and checked. The technique applied here 
for decoder circuitry is equally applicable to all other 
common parts of a memory system, such as driver circuits, 
refresh logic and bus interfaces. The checking of control 
lines is futher discussed in section 7.4.
The above scheme takes advantage of data encryption in an 
error detecting code. It is also possible to detect 
selection and addressing errors independently of the data 
error detection code. If, for example, selection failures 
cause the selection of a data word whose address is 
different from the desired address by an odd Hamming 
distance, then these failures can be detected if the 
parity of the address is stored with each word. The 
address parity bit is then compared with the actual 
address parity, when the word is retrieved from memory. 
The disadvantage of this scheme is that is requires an 
extra bit to be stored with each word and that it only de­
tects the class of errors described above. A circuit using 
this scheme is therefore not fault secure for failures 
that cause a double selection, two words ORed together, or 
selection of an erroneous word whose address is an even 
Hamming distance from the address of the desired word. 
This scheme has been used to detect addressing errors in 
microprogram control memory C7.53.
Parity can be replaced with any other separable error 
detecting code. The choice of code is determined by the 
type of addressing errors to be detected. Other codes
-229-
though, in general, will require more than one extra bit 
to be stored with each data word.
7.3.3 : Code Translation
Whilst it may be desirable to provide byte error detection 
for the buses of a microprocessor based system, the 
redundancy of these codes may not make their use in main 
memory or mass storage economically viable. A code with 
less redundancy (fewer check bits) may be more desirable 
for memory. This then requires encoded data to be 
translated from the memory code to the bus code when it is 
read from memory and back again when it is written to 
memory. Totally self checking (TSC) networks for perform­
ing this translation are now considered, where both codes 
are separable.
The interface between buses and memory can be accomplished 
by the TSC circuitry shown in Fig. 7.5. Data is checked 
in each direction by a TSC checker for the source code. 
Check bits are generated for the* destination code from the 
data bits in each case. This generation circuitry is self 
checking because a generator fault produces either correct 
check bits, or incorrect check bits making the output a 
non codeword.
The scheme of Fig. 7.5 requires two check bit generators 
and two TSC checkers. Each checker can be implemented 
using a check bit generator and a TSC equality checker, as 
described in section 5.4.2. As two check bit generators 
are already required elsewhere, the generators required 
for the checkers can be replaced with the additional 
control logic shown in Fig. 7.6. Multiplexers are used to 
connect the inputs of each check bit generator to the
appropriate bus, dependant on the direction of data
transfer. This scheme will not be cost effective unless 
multiplexers are cheaper than check bit generators. This 
in turn depends on the size of the data bus and the
complexity of the check bit generator for the particular
-230-
code used.
In Fig. 7.7, the scheme of Fig. 7.6 is modified to provide 
code translation for data on a bi-directional bus. Check 
bit tri-state drivers are required instead of data bit 
multiplexers, which should be cheaper.
Wakerley C7.173 suggests that additional savings can be 
made for certain combinations of bus and memory codes, 
where the memory check bits can be directly derived from 
the bus check bits.
Code conversion could also take place again for arithmetic 
and logical operation, where, for example, residue codes 
are more efficient for arithmetic operations than parity. 
However, the conversion logic must not be over complex, or 
add an excessive propagation delay to data transfers.
7.3.4 : Interleaved Coding
It is often desirable to retain unmodifed binary informat­
ion in data tranfers, so that subsequent processing of the 
data can be performed without any code conversions. In 
this case, a separable code must be used, the simplest of 
which is parity. Single parity will not detect a fault 
which causes an even number of bits to be in error within 
a data word, but it is still the simplest and most widely 
used check scheme. However, if multiple bit errors affect 
adjacent bits instead of being randomly dispersed through 
the encoded word, as Cook et al suggest C7.11D, then these 
faults can be detected by interleaving parity encoded data 
information and m-out-of-n encoded control information as 
in Fig. 7.8. Any multiple adjacent bit fault will affect 
both the binary data field and the ra-out-of-n codeword in 
the control field. Consequently a single bit parity check 
is adequate to detect single bit errors errors in the data 
field, since multiple adjacent bit errors will be detected 
by the m-out-of-n code check. This interleaved coding 
technique does not require any additional hardware to give
-231-
enhanced error detection.
7.4 : CONTROL PATHS
Previous sections have examined methods of detecting 
errors in data transfer, storage and manipulation in a 
computer. There must be some form of control unit to 
manage these operations. In some instances it may be 
necessary to duplicate the control unit for error de­
tection purposes. However, in general, there are certain 
aspects of control which do not require duplication. This 
section investigates error detection for a number of 
these aspects.
7.4.1 : Clocks
Errors can occur in the single or multiphase clocking 
signals which control the various stages of an operation, 
both within the CPU and I/O devices and between the 
components of a microprocessor based system (CPU, I/O, 
memory and discrete logic). Possible clock errors are 
period and mark-space variation, phase overlap and stuck- 
at faults. The methods used by Chang for checking 
multiphase clock signals have already been described in 
section 6.9. Alternatively, a single phase clock may be 
used and checked with the TSC periodic signal checker, 
also described in section 6.9. If multiple phases are 
required from this arrangement, they can be derived from 
a faster single phase clock in conjunction with a small 
self checking sequential circuit.
7.4.2 : Control Lines
Control lines can be grouped together and encoded in an 
error detecting code just like data lines. The choice of 
code will be dependent on the level of fault detection 
required versus the redundancy and cost of the checking 
scheme. Whilst this method checks the transmission of 
control signals, it does not guarantee that they reach
-232-
their destination correctly. For example, suppose one of 
the control lines is used to gate bus A on to bus T, as 
shown in Fig. 7.9. Data on the buses is assumed to be 
encoded in an error detection code. A break in the 
'T 4- A' control line affects multiple bits on the T bus 
and may not be detected by the data path error detecting 
code. A break at point a in Fig 7.9 will be detected by
some input combination on bus A, since it does not affect
all bits of the T bus. However, a break at point b can
produce an all 0 output for some logic families and will 
never be detected if all 0's is a codeword in the data 
path error detection code.
Errors caused by broken lines might seem improbable when 
considering interconnection wires or PCB tracks, but bad 
connections in card connectors and integrated circuit 
sockets, plus wire breaks between bonding pads and pins 
in integrated circuit packages are all possible faults. 
A means of detecting errors caused by these failures
should therefore be available.
Fig. 7.10 shows a solution to this problem, whereby the
control lines are routed to their various destinations and 
then regrouped for checking purposes at the end of their 
transmission paths, i.e. after all fan-out points. The 
checker monitors the data bits of the control word at this 
point and the check bits at their termination point. Now 
a failure in the control line(s) is detected by this 
means, whilst a failure in the same line(s) to an individ­
ual bit-slice is detected by the data path error detection 
code, as shown in Fig. 7.10. If the data path code is 
byte error detecting, then one line may be tapped off the
main control line for each byte-slice.
In the PSC networks of Figs. 5.20-5.23, an inverter may be 
used to generate s2 from s^. A failure in this inverter 
will be detected by the checking mechanisms for the
network, whilst a failure in the main control line 
(affecting both s2 and s^) will be detected by the control
-233-
line checking mechanism outlined above.
7.4.3 : Decoders
The example in the previous section used a single control 
line to provide a particular operation. In practice, 
there may be many of these lines generated internally, or 
externally to an integrated circuit for data selection or 
addressing. If only one line is active at any time, then 
collectively the lines form a 1-out-of-n code. It is
word, as in the case of address decoding for a micro­
processor based system, see Fig. 7.11.
Decoder faults are mostly unidirectional. The errors, as 
indicated in Fig. 6.36 for the circuit of Fig. 6.33, take
two forms. Either the selected output changes from a 1
(active) to a 0 (inactive) and there is no active output
supplied, or one or more unselected outputs change from a
0 to a 1, giving a multiple output error.
Consider the case of address decoding within a read/write 
memory chip. When reading data from the chip, decoder
failures can cause the contents of two memory locations to
be gated on to the data bus, usually producing the logical 
OR of the two words. This is a unidirectional error. The 
circuit will be self testing for this type of failure, 
since, in general, the logical OR of two codewords is a 
non codeword. However, it is not fault secure since an 
erroneous codeword could be produced. When writing data to 
a particular location, failures in the address decoder can 
cause data to be stored in two locations, with erroneous 
data replacing the data in one of them. This error may 
never be detected (when reading the data) because the new 
data is a valid codeword. In addition, decoder failures 
could result in no access to any memory location for both 
read and write operations. If the memory outputs all l's 
(or all 0's) for this failure (a unidirectional error) and 
all l's (or all 0's) is a codeword, then the failure is
likely that they will have been derived from a
-234-
undetectable. Some means other than encoded data must 
therefore be provided for the detection of decoder 
failures.
A straightforward check for decoder errors is to dup­
licate the decoder and compare the outputs of the two 
circuits. If a fault exists in either decoder, it will be 
detected by the comparator. The major disadvantage of 
such a scheme, (aside from identical output faults in both 
circuits) is that a comparator with a large number of 
inputs could be required. The alternative means of check­
ing for decoder errors is a 1-out-of-n code checker.
The design of two TSC 1-out-of-n code checkers has already 
been described in section 6.8. A self testing only 1-out- 
of-n code checker is detailed in section 5.6. Both 
Wakerley C7.23U and Kraft & Toy C7.19D present further 
designs for self tesing only 1-out-of-n code checkers. 
These compare input data to the decoder with re-encoded 
outputs from the decoder. Wakerley also shows that this 
scheme can become TSC if the inputs to the decoder are 
encoded in a k-out-of-2k code. TSC checkers for m-out-of- 
n codes in general have also been proposed, see section 
6.10 for references.
Comments made in section 7.3.2, on the detection of data 
selection errors, also apply here. However, the technique 
of decoder replication may be more expensive than the self 
testing only and TSC checking schemes discussed above. 
Decoder replication is practical if a bit or byte-sliced 
register bank, multiplexer, demultiplexer or functional 
unit is implemented using conventional MSI or LSI chips 
which have on-chip decoders.
7.5 : ARITHMETIC OPERATIONS
The basic arithmetic operations are addition, subtraction 
and shifting. Iterative operations, such as multiplica­
tion and division, are accomplished in software or firm­
-235-
ware by repeated use of these basic operations. However, 
as the scale of integration continues to increase and the 
cost of hardware continue to decrease, hardware multiplier 
and divider packages become more attractive for systems 
which require high performance operations. This section 
briefly discusses error detection for the basic operations 
using error detection codes. Errors in the iterative 
operations manifest themselves as errors in the basic 
operations, or as control errors. Totally self checking 
checkers are used in all cases.
Typically, binary adders are implemented with either 
ripple carries or with look-ahead carries, or both. The 
choice of implementation is governed by the cost of 
associated hardware and the speed of the arithmetic 
operation required. The error characteristics of a binary 
adder depend heavily on its structure. The fault detec­
tion scheme must therefore take into account the resultant 
error caused by logic faults within the adder.
There are several error detecting codes which can be used 
to directly implement addition and subtraction. The AN 
codes, as described in section 4.3.5, represent one of 
these. Low cost AN codes have check bases of the form 
A=21)-l. In this case, the modulo 2^-1 addition is 
performed using ordinary binary adders with end around 
carry. This is standard one's complement addition. 
Avizienis C7.243 gives a set of algorithms for multipli­
cation and division of AN encoded data.
The low cost residue codes can also be used for direct 
implementation of addition and subtraction, with the added 
advantage of being separate. Wakerley C7.25D gives 
algorithms for arithmetic operations on low cost residue 
encoded data. In many of these operations, the check bits 
must be adjusted in order to produce the correct codeword. 
This is achieved by adding (or subtracting) bits, derived 
from either the data or carry parts of the result, to the 
check bits. The circuitry which controls the check bit
-236-
correction must also be checked. Examples of residue 
checked adders are given by Kraft and Toy C7.193.
In some cases the specialised circuitry required for check 
bit correction and the propagation delay it imposes, has 
resulted in designers choosing duplication for a checked 
ALU design in favour of this technique C7.263. Duplica­
tion is definitely a better choice if the ALU is fabricat­
ed on one chip as opposed to byte-slices.
Non arithmetic codes are not preserved by arithmetic 
operations; i.e. an arithmetic operation performed on two 
codewords in general produces a non codeword. However, if 
a separable code is employed, only the check bits result­
ing from the sum of two codewords are incorrect. New 
check bits can be generated from the data part of the sum. 
Check bits generated in this manner, though, cannot 
indicate errors within the addition operation, because 
they are derived from the result alone (correct check bits 
will be generated from an erroneous result). If, however, 
new check bits are generated from the check bits of both 
inputs and the resultant data bits, then errors in the 
addition process will produce errors in the generated 
check bits. This method is called check bit prediction, 
previously mentioned in sections 4.3.1 and 4.3.4. The 
amount of circuitry required to generate new check bits 
after a logical or arithmetic operation is dependent on 
the error detecting code used. Wakerley C7.253 presents 
techniques for the application of check bit prediction 
with parity and checksum codes. In both cases, carry 
generation circuitry must be duplicated to detect faults 
within it. He also examines carry look-ahead structures. 
Kraft and Toy C7.193 also detail techniques for parity 
prediction in ripple carry adders and look-ahead carry 
adders, again stressing the need for duplicated carry 
circuits.
Binary counting is one of the most important operations 
within a processor. Its program counter is the most
-237-
obvious example. There may also be a requirement for 
counters in specific application circuitry. In the design 
of self checking processors or systems, these counters are 
often checked by parity prediction, whereby the parity of 
the next count is calculated and subsequently compared 
with the actual parity for that count. Kraft and Toy 
C7.193 describe parity prediction for counters in some 
detail. Self checking counters as well as shift registers 
are also discussed by Pradhan C7.273.
7.6 : LOGICAL OPERATIONS
Parity codes are preserved by the EXOR operation and a 
TSC network for computing the EXOR of two k-bit parity 
encoded words consists of k x 2-input EXOR gates, as Fig. 
7.12 taken from reference C7.28D shows.
The duplication code is also preserved by the EXOR 
function, as well as all other logical functions. A 
totally self checking logical network can be designed by 
simply duplicating the functional unit. The output 
code space of such a network is the duplication code 
S = C<DD'>|D=D'1, shown in Fig. 7.13. The network is self 
testing and fault secure for all multiple faults that 
affect disjoint sets of output in the two logic units. A 
fault which has the same effect on both logic units is 
undetectable. The network is neither self testing or 
fault secure for these faults. In addition to the two 
logic units, the TSC network has a TSC equality checker, 
as shown in Fig. 7.13. This indicates an error when the 
outputs from the two logic units are different.
The outputs from the duplicated logic unit in Fig. 7.13 
go no further than the equality checker, for the purposes 
of the TSC network. If encoded outputs are to be main­
tained, these outputs could be extended. However, a bit 
(or byte) error detection code, such as parity, may be 
adequate for the data transmission paths (see section 
7.3). In this case, a TSC code translation will be
-238-
required to convert the duplication code to say a parity 
code, as indicated in Fig. 7.14. The network of Fig. 7.14 
is a totally self checking check bit prediction scheme. A 
TSC code translator can be formed by using the unmodified 
outputs of one logic unit to supply the output data bits 
and using the outputs of the duplicated unit to feed a 
check bit generator, which in turn supplies the output 
check bits, as shown in Fig. 7.15.
All the above schemes discard the check bits of the input 
codewords. They are, therefore, not code disjoint, since 
a non codeword input will produce a codeword output. 
Checkers are thus required at the logic unit inputs, so 
that input errors can be detected. Alternatively,
duplicated logic units which are code disjoint must be 
designed. Wakerley C7.293 does this for AND and OR 
operations using parity, residue and checksum codes.
Wakerley C7.29D also discusses a method of pre-checking 
logical operations using any error detection code. In 
this technique errors occurring -during a logical operation 
are not detected, but faults that could affect the 
operation are tested immediately before it takes place. 
The disadvantage of this method is that it increases 
the time to perform such operations (time redundancy).
A development from the concept of pre-checking is that of 
partially self checking (PSC) circuits and networks, 
discussed in section 5.5. The PSC logic unit described in 
section 5.5.4 preserves even parity n-bit input vectors 
for a number of its functions. If the normal input set to 
the unit contains a set of code preserving operations 
which test all single faults in each bit-slice (see 
section 5.5.4), then the circuit is TSC for this set. 
Thus all faults can be tested by using a certain set of 
functions. The unit is not self checking for faults in 
the control lines before they fan out to individual 
slices, nor for single slice faults such as input diode 
shorts, that affect control lines everywhere within the
-239-
slice. However these faults can be readily detected by 
the control line checking techniques discussed in section 
7.4. The circuit of Fig. 5.24a can be used in the design 
of a PSC universal logic unit. The checker will be
enabled for code preserving operations and disabled for 
non code preserving operations, such as AND and OR.
Kraft and Toy C7.193 also examine the circuit of Fig.
5.24a, concluding that A ©  B and its complement A ©  B are 
capable of exercising every node. Consequently, 
the parity check for each Boolean logic function needs to 
be performed on only the EXOR function and its complement.
If certain logical operations are checked, other logical 
operations can be performed using these checked operations
in conjunction with checked arithmetic operations and a
set of identities. For example, if the AND operation is 
checked, other logical operations can be performed using 
identities such as C7.29D;
A OR B = A plus B minus A AND B
A EXOR B = A plus B minus 2 (A AND B)
NOT A h  minus A (l's complement)
NOT A =  minus A minus 1 (2's complement) (7.3)
In this case, the AND operation could be checked by 
duplication and the arithmetic operations checked by
residue codes. Another possibility is to use the parity
check code, so that the EXOR operation can be performed by 
a simple TSC circuit (Fig. 7.12), with parity check bit 
prediction for the arithmetic operations. Other logical 
operations can then be computed using an identity such as 
C7.293:
A AND B = 1/2(A plus B minus (A ©  B)) (7.4)
7.7 : HARDWARE CHECKS FOR SOFTWARE FAULTS
Aside from hardware faults, software faults can also cause 
errors in a system. However, software failures may be
-240-
more difficult to handle, particularly when little used 
branches are executed, or when data relevant to a program 
is corrupted. Hardware checking procedures, as described 
in previous sections, are designed specifically to detect 
physical failures in the hardware. They will not, in 
general, be capable of detecting software faults. Dupli­
cate CPUs will not help either, because a software fault 
will be executed in the same manner by both processors. 
In real time applications it is desirable to recognise 
software faults rapidly and take the necessary corrective 
action to remove them from the system. Three techniques 
to detect software faults are as follows:
1) Watchdog Timer.
2) Branch Allowed Check Bit.
3) Different Parity For Instructions and Data.
A watchdog timer is a hardware timer which runs continu­
ously. It is periodically reset by the main program. If, 
due to a hardware or software fault, the timer is not 
reset, it times out and appropriate action is taken, such 
as a high priority interrupt to the processor.
A branch allowed check bit is a means of detecting the 
execution of an improper branch operation. A check bit, 
called a branch allowed (BA) bit, is assigned to each word 
in memory. If the BA bit contains a 0, for example, the 
contents of that location may not be referenced by any 
branch instruction. This is illustrated in Fig. 7.16. 
The bit is checked during every branch operation, with a 
branch to a word where BA=0 generating an error.
If a parity check bit is assigned to each word in memory, 
the parity bit can be used to distinguish between instruc­
tion and data words. This can be effected by assigning 
odd parity, for example, to an instruction and even parity 
to a data word, as shown in Fig. 7.17. A data word 
referenced as an instruction, due a software or hardware 
faults, is readily identified and appropriate action taken.
-241-
7.8 : FAULT DIAGNOSIS
Fault detection determines whether or not a circuit is 
behaving correctly. Fault diagnosis identifies the
failure to a replaceable unit. This replaceable unit may 
be a component, circuit, board or subsystem. Traditional­
ly, a fault diagnosis routine uses fault detection 
hardware and test sequences to locate a defective unit.
If the units are designed in a self checking manner, 
hardware check circuits detect a fault and identify the 
defective unit down to the lowest level desired. The 
level of diagnosis is dependent on the quantity and 
position of checkers. If duplication and comparison is 
used for circuits and devices, then a mismatch indicates 
a failure, but the fault cannot be diagnosed to the 
functional unit or its duplicate. Thus, both will need to 
be replaced on detection of an error.
A replacement at board level may* be justified on the basis 
of a faster repair time for the faulty unit, or the cost 
savings of not having to design additional fault diagnosis 
hardware and software. However, when a board is replaced, 
the integrity of the replacement unit must be verified to 
establish that it does not also possess a fault. It is, 
therefore, necessary for the complete system to be 
extensively tested. However, if the self checking
circuitry is designed to diagnose faults down to the 
smallest level possible (component), then this will be 
unnecessary, as the system will indicate if the replace- 
ent device or board is defective in use. It may take 
quite some time, though, to generate all vectors in normal 
operation so that each unit and each checker are self 
tested, so particular test programs may be run which are 
specifically designed to exercise the system for this 
purpose.
In the checking schemes of sections 7.3 to 7.6, errors are
-242-
detected concurrently with normal system operation. It is 
assumed that this error detection occurs before the fault 
contaminates other parts of the system, making diagnosis 
difficult. The amount of time which elapses between the 
occurrence of a fault and its detection is dependent on 
the position and quantity of checkers, plus the level 
of fault detection provided by each individual technique 
adopted.
7.8.1 : Fault Indication
The l-out-of-2 outputs from each checking circuit (equal­
ity, 1-out-of-n and periodic signal checkers etc. ) 
can be combined using an n-input Morphic AND gate to 
produce a single l-out-of-2 pair, which indicates an error 
free or failed system. This signal can then be used to 
halt or interrupt the processor, store all error signals 
for subsequent analysis, restart the processor and re­
initialise or reconfigure the system.
Error signals from each fault detection circuit provide 
detailed diagnostic information about the system. This 
information has to be acquired from the system. This can 
be achieved by the error signals setting flip-flops in a 
register (with appropriate test mechanisms if it is not 
self testing). The contents of this register can be 
accessed externally and/or displayed on a series of 
indicator lamps, such as light emitting diodes (LEDs). A 
fault table can be generated to locate a fault to the 
desired level from a given set of error indications. This 
assumes that the fault detection circuit has the ability 
to detect faults down to the lowest level required.
The error condition in the l-out-of-2 pair is represented 
by the vectors <00> or <11>, so a single EXOR gate can be 
used to distinguish error free and faulty vectors, as 
indicated in Fig. 7.18. However, the EXOR gate is not a 
self testing circuit, in that certain failures within the 
gate will mask an error signal from the checker.
-243-
(Theorem 6.4 gives the conditions for.an EXOR gate to he 
self testing.) The most obvious failure in this category 
is the EXOR gate output stuck-at one, indicating a 
permanent error free condition, despite the outputs of the 
checker. It is therefore essential that the EXOR gate is 
checked periodically to ensure that it is operating 
correctly. An input vector must be applied to verify that 
it can generate an error indication. The vector should 
thus be <00> or <11>. This can be effected by a special 
test signal. Fig.7.19a shows one such mechanism, which 
requires an additional EXOR gate. Under normal conditions 
the test input is at 0. This allows the <01> and <10>
vectors from the check circuit to pass directly to the 
output EXOR gate, as in Fig. 7.18. The test input set to 
a 1 primarily confirms the operation of the output EXOR 
gate, but also checks the additional EXOR gate. Fig. 
7.19b gives the fault table for these two gates. Now with 
<01> or <10> output vectors from the checker, the input 
vectors to the output EXOR gate are <00> and <11>
respectively,which means that this gate should output a 0.
7.8.2 : Fault Isolation
In general, there will be a fan-out from every logic gate 
output, i.e. every gate output will be connected to at 
least one gate input. This is illustrated in Fig. 7.20a, 
where a single transmitter (gate output) sends data to 
many receivers (gate inputs). Another configuration is 
possible, shown in Fig. 7.20b, where a single receiver
accepts data from one of several transmitters. The
transmitter will be either wire ORed (or wire ANDed) 
together, or have tri-state output buffers. This latter 
configuration is generally associated with bus structures.
Consider now the encoded and checked bus structures of 
Figs. 7.21a and 7.21b. For diagnostic purposes the output 
of each buffer block is checked for failures. The fault 
tables in Figs. 7.21a and 7.21b identify the faults that 
are detected by each checker in the circuit. These reveal
-244-
that any input or output fault that causes the bus to 
become stuck at a particular logic level is undiagnosable. 
For example, in Fig. 7.21a, if the bus becomes stuck-at-0 
(SAO), it is unknown from the checker indications if this 
is caused by the output of buffer B1 SAO, or the inputs 
of buffers B2 or B3 SAO. The only solution, aside from 
using a current probe to determine which device is sinking 
current and therefore faulty C7.303, is to replace each 
device connected to the bus in turn, until the fault is 
eliminated. Sedmak C7.63, incidentally, investigates the 
structure of Fig. 7.21a, but does not consider bus 
failures caused by gate input failures.
An alternative solution is to introduce signal isolators 
at strategic points in the signal lines, so that there is 
only one output or input connected directly to the bus. 
Then a stuck bus line (assuming that the lines themselves 
are not open or short circuit) can only be due to the 
directly connected device. The structures of Fig. 7.21a 
and 7.21b now become those of Figs. 7.22a and 7.22b 
respectively. The isolation circuits are allowed to fail, 
but not in such a mannner as to cause a stuck-at failure 
where they are connected to the bus. An input to output 
short is another failure which is not allowed. Either of 
these failures would render the isolator circuit useless.
The fault tables in Figs. 7.22a and 7.22b now reveal that 
faults in individual devices are diagnosable to that 
device from checker indications. The additional checkers 
(C3 and C4) shown in Fig. 7.22b serve two purposes; 
firstly they detect isolator faults and secondly they 
improve the diagnosability of this structure. Isolator 
failures are modelled as producing identical fault indica­
tions as the single input or output of the device they 
are connected to and isolate. Each isolator and the 
device it isolates are then considered as one unit and 
replaced as such. This is an ideal situation, which in 
practice is dependent on the actual implementation of the 
isolator circuitry.
-245-
Moreira de Souza et al, in their research oriented 
computer C7.313, have used a series resistor and CMOS 
buffer as an isolator for each signal line in Fig. 7.22a 
and a series diode for each signal line in Fig. 7.22b,
resulting in Figs. 7.23a and 7.23b respectively. The 
resistors used are wirewound types, where the probability 
of an open circuit is much greater than the probability of 
a short circuit. The CMOS buffers ensure that minimal 
voltage is dropped across the resistors, so that an
acceptable noise immunity is maintained. The tie-up
resistors on the transmitter outputs in Fig. 7.23b
maintain odd or even parity for the checker between a 
transmitter and its isolator when that transmitter is in­
active. Appendix B shows that both short and open circuit 
diode failures are detected in this structure.
In the STAR computer C7.43, Avizienis used the arrangement 
shown in Fig. 7.24 for input and output isolation from a 
bus. Unfortunately, the series static redundancy employed 
for the isolation components is -useless, since a failure 
within them cannnot be detected. If, for example, one of 
the output diodes become a short circuit, then the isola­
tor is still functional, but the failed diode must be 
detected before the other diode fails as well. Avizienis 
does not indicate how this is effected, if at all.
It is envisaged that these isolators would, ultimately, 
become integrated, or at least gate circuitry redesigned 
so that physical input and output stuck-at failures could 
not occur. Wirewound resistors are not easily integrated, 
so an alternative isolator to that in Fig. 7.23a had to be 
designed for the self checking system of Chapter 9. The 
circuit adopted is that in Fig. 7.25.
Both Moreira de Souza and Avizienis only consider isola­
tors for unidirectional buses. In Fig. 7.26a, the 
structures of Fig.7.22 are merged to from the structure of 
a bidirectional bus. The self checking system of Chapter
9 has a bidirectional data bus, so a bidirectional 
isolator was required for this purpose. This is shown in 
Fig. 7.26b. The circuits of Figs. 7.25 and 7.26b are 
derived in Appendix B.
7.8.3 : Transient Error Indications
The outputs of each checker and/or the combined error 
signal pair can be monitored continously or loaded into a 
register at certain defined intervals, for example on an 
edge of a system clock. If continuously monitored, there 
may be certain regular transient errors which occur during 
the time interval when a number of signals are changing 
state.
An example of this is the decoder in Fig. 6.33. Assume 
that its inputs change from <11> to <00>. However, prim­
arily because of variations in the gate delays encountered 
by each line, the inputs may go through the following 
transitions: 11 —> 01 —> 00. These transitions give the 
decoder output waveforms shown in Fig. 7.27 and result in 
a transient spike from output y2» These waveforms 
indicate that the decoder generates a 1-out-of-n code 
output at all times, but the spike may cause a transient 
error in another part of the system. Assuming they are 
not a problem, these transient errors will be ignored if 
the checker outputs are sampled only when the system 
signals are known to be stable. Alternatively, if checker 
outputs are monitored continuously, the transient error 
indications can be eliminated by filtering the checker 
outputs, to remove transient errors of less than a defined 
duration, or by preventing the transient spikes from 
occurring. The latter solution is effected by inhibiting 
the decoder outputs until the input signals are stable. 
Alternatively, if the input data to the decoder is 
encoded, the code can be chosen such that all possible 
intermediate transitions between codeword inputs are non 
codewords. In this case output spikes do not occur. 
M-out-of-n codes are suitable for this purpose C7.193.
-247-
7.8.4 : Verification and Efficiency of 
Fault Detection Circuits
As indicated in the section 7.8.1, any non self testing 
section of fault detection hardware or subsequent error 
processing circuitry must be periodically exercised to 
ensure that it is fully operational. This process uses 
hardware to simulate test conditions or circuit faults and 
software to control it. The error detection circuits can 
be checked automatically by generating an error condition 
and observing their response. For parity and m-out-of-n 
code checkers, for example, a non codeword needs to be 
generated by the system. This can be effected by applying 
special test signals as stimuli to the check hardware, so 
that an error is created.
In reference C7.53 Chang describes mechanisms to exercise 
check circuits. The data transfer and storage parity 
checkers are exercised by reading idle or power on memory 
locations, i.e. memory which has not been written to. 
Hopefully, some of these will contain words with incorrect 
parity. Several words with incorrect parity are included 
in the system ROM or EPROM for a similar purpose. A 
1-out-of-n decoder checker is exercised by selecting an 
unused decoder output which, by design, is not an input to 
the code checker. The checker is therefore given the 
impression of no output selected. A second test is the 
selection of another unused output, which in this in­
stance, again by design, is fed twice into the checker. 
The checker is now given the impression of two active 
lines.
The Sperry Univac 1100/60 uses a combination of software 
and hardware to verify that fault detection, isolation and 
recovery mechanisms are operational C7.73. This
capability is provided by fault injection, a process of 
deliberately causing a fault to occur within a system by 
the insertion of erroneous data or control signals into
-248-
a portion of logic covered by a fault detection circuit. 
The need for such testing arises because the section of 
logic which processes the error signals is not frequently 
exercised during normal operation and is not self testing.
Fault injection is also the subject of a paper by Crouzet 
and Decouty C7.323. In order to determine the effective­
ness of the fault detection mechanisms in their research 
computer , described in section 8.3.10, they inject over
75,000 faults into it. The efficiency of the fault
detection mechanisms, defined as the probability of 
detecting a fault knowing that one has occurred, was over 
99%. A measurement of the safety of the detection
mechanisms, defined as the probability of the system 
producing error free output data, knowing that a fault has 
occurred, was 100%. Both test and application programs
where used whilst injecting faults, with the test
program giving a higher rate of fault detection, as 
expected. This type of analysis highlights weak points 
in the design for self checking. In the case of the 
research computer, these were found to be multiplexed bus 
checking and the implementation of Morphic logic func­
tions.
7.8.5 : Marginal Testing
In most of the discussion so far on error detection, it 
has been assumed that circuit faults were hard faults. 
These faults are always repeatable, since the same errors 
will always be generated under the same set of operating 
conditions. However, a logic gate or circuit as well as 
failing instantaneously due to, for example, electrical 
overstress, can also fail gradually, in that its operating 
voltage and current levels no longer fall within specified 
tolerance levels over a period of time. This gradual 
degradation creates maginal operating conditions, which 
lead to transient errors, as the circuit appears to be 
functioning correctly for most of the time. In addition, 
such marginal conditions may cause failures which are not
-249-
repeatable under the same set of circuit conditions. 
These faults are therefore extremely difficult to identi­
fy. Kraft & Toy C7.333 consider aspects of marginal 
testing, whereby voltage threshold levels are adjusted, 
dynamic memory refresh rates reduced and clock rates to 
the CPU increased.
7.9 : SUMMARY AND CONCLUSIONS
Section 7.2 has considered the positioning of checkers in 
a network of self checking blocks. However, whilst these 
may be the minimum requirements to detect all faults from 
a specific set, many additional checkers are required in 
practice, depending on the level of fault diagnosis 
required. In the experimental computer of Chapter 9, for 
example, fault diagnosis is desired at chip or circuit 
level.
Fault detection in data transmission paths is effected 
with an error detecting code. The level of fault detec- 
ion is dependent on the code chosen. The choice of code 
is also governed by the amount redundancy it invokes and 
the cost of generating and checking it. If different 
codes are adopted for data paths and memory, then code 
translation circuitry will also be required.
Faults in the control circuitry of memory can be detected
without the need for a separate check, if either the 
memory is organised in bit or byte-slices, for bit and
byte data error detecting codes respectively, or the 
check bits of the address error detecting code are stored 
with the data and checked when the data is retrieved from 
memory.
If the information on the data transmission paths is 
parity encoded, then the detection of errors in this
information can be improved above that of parity, by 
interleaving it with encoded information whose code has 
better error detection capabilities than parity.
-250-
Control lines must also be checked. Some control line 
failures will produce errors in the data transmission 
paths and be detected indirectly. Other faults will not 
be detected in this manner, so the control information 
must be encoded and checked. Decoder outputs, for 
example, which are inherently 1-out-of-n encoded, must be 
checked with a totally self checking 1-out-of-n checker. 
The checking scheme for control lines must always include 
a checker at the end of their transmission paths, to 
ensure that all breaks in these lines are detectable.
Circuitry performing arithmetic operations can be checked 
by duplication and comparison, by encoded inputs where 
the error detecting code is preserved by the arith­
metic operation(s) (with additional circuitry for check 
bit correction), or by encoded inputs with check bit 
prediction circuitry. Chapters 8 and 9 indicate that 
duplication and comparison is the most widely used 
technique. Despite its redundancy, the technique requires 
a minimum of design effort to implement.
Circuits performing logical operations can be checked by 
duplication and comparison, by encoded inputs using a set 
of code preserving operations which test all single faults 
in each bit-slice, or by encoded inputs with check bit 
prediction crcuitry. Alternatively, if certain logical 
and arithmetic operations are checked, other logical 
operations can be performed via a set of identities using 
these checked operations. Again duplication and comparison 
is the most widely used technique.
Three hardware checks for software faults have been 
described. These are a watchdog timer, a branch allowed 
check bit and different parity for instructions and data. 
However, there appears to be little attention paid to 
software faults^ and therefore these tests^ in the self 
checking systems reviewed in Chapter 8.
-251-
The level of fault diagnosis is dependent on the quantity, 
position and checking ability of the self checking
circuits. All l-out-of-2 outputs from the checkers can be 
combined to produce a single l-out-of-2 pair, which will 
take some action on detection of an error, but all
individual outputs are required for diagnostic purposes. 
The individual signals and/or the combined signal may also 
be visually displayed. The diagnostic and display
processes will involve some non self testing circuitry. 
Provision must be made to periodically test this cir­
cuitry.
Although the fault detection logic is self checking, it 
can be checked, along with the error processing logic, by 
special hardware and software provided to create errors 
in the system. This is a form of fault injection, which 
can also be effected by using external mechanisms to 
simulate various faults. The efficiency of the fault 
detection circuits can also be evaluated using fault 
injection.
The outputs from each checker may be monitored continuous­
ly to observe all transient errors, but unwanted tran­
sition spikes will have to be eliminated from the system. 
In general, however, the checker outputs are sampled at 
regular defined points during system operation, for the 
detection of transient and permanent faults.
Mechanisms may also be provided within the system for 
marginal testing, so that degradations in operating 
conditions are detected. These mechanisms appear to be 
rare, however.
Fault diagnosis is assisted by the inclusion of isola­
ting circuits between a gate output (transmitter) and the
gate inputs (receivers) it drives. These are ideally 
integrated as discrete package or within the device they 
are isolating. Circuits adopted in the experimental
self checking computer of Chapter 9 are discussed in
-252-
Appendix B.
Various techniques for fault detection and fault diagnosis 
have been presented in this Chapter. It is important to 
examine the extent of their usage in current and past 
research, not only as a means of putting them into
perspective, but also to determine how far the aims of 
this investigation have already been met. A number of 
self checking devices and systems are therefore reviewed
in Chapter 8 for this purpose. Many of the techniques for 
fault detection and fault diagnosis, presented in this
Chapter, are practically applied in the experimental self 
checking computer of Chapter 9.
-253-
7.10 : REFERENCES
7.1) Designing reliable computer systems; the fault
tolerant approach - 1 - R. G. Bennetts; IEE Elec­
tronics and Power; November/December, 1978; pp. 
846-851.
7.2) Designing reliable computer systems; the fault
tolerant approach - 2 - R. G. Bennetts; IEE Elec­
tronics and Power; January 1979; pp. 51-56.
7.3) Fault tolerance and digital systems - R. G. Ben­
netts; Microprocessors and Microsystems; Vol. 3, 
No. 8; October, 1979; pp. 365-373.
7.4) Fault tolerant computing: Techniques and develop­
ment - A. Avizienis; Infotech*; pp. 309-333.
7.5) Maintenance techniques of a microprogrammed self-
checking control complex of an electronic switching
system - H. Y. P. Chang, G. W. Heimbigner, D. J.
Senese, T. L. Smith; IEEE Trans. Comput., Vol. 
C-22, No. 5; May, 1973; pp. 501-512.
7.6) Fault-tolerance of a general purpose computer 
implemented by very large scale integration - R. M. 
Sedmak, H.L. Libergot; FTCS-8*; pp.137-143.
7.7) Availability, reliability and maintainability 
aspects of the Sperry Univac 1100/60 - L. A. Boone, 
H. L. Liebergot, R. M. Sedmak; FTCS-10*; pp. 3-8.
7.8) The STAR (self-testing and repairing) computer: An
investigation of the theory and practice of fault- 
tolerant computer design - A. Avizienis, G. C.
Gilley, F. P. Mathur, D. A. Rennels, J. A. Rohr, D.
K. Rubin; IEEE Trans. Comput.; Vol. C-20, No. 11; 
November, 1971; pp. 1312-1321.
7.9) Microprogrammed control and reliable design of 
small computers - G. D. Kraft, W. N. Toy; Prentice- 
Hall Inc., Englewood Cliffs, New Jersey, USA; 1981; 
Chapter 9, pp. 368-415.
7.10) Design of a microprogram control for a processor 
in an electronic switching system - T. F. Storey; 
Bell Syst. Tech. J.; February, 1976; pp. 183-232.
7.11) Design of a self-checking microprogram control - R. 
W. Cook, W. H. Sisson, T. F. Storey, W. N. Toy; 
IEEE Trans. Comput.; Vol. C-22, No. 3; March, 1973; 
pp. 255-262.
7.12) The design of a microprogrammed self-checking 
processor of an electronic switching system - H. Y. 
P. Chang, R. C. Dorr, D. J. Senese; IEEE Trans. 



















7 . 3 0 )
Architectural design for near - 100% fault coverage
- J. J. Stiffler; FTCS-6*; pp. 134-137.
Design techniques for reliable hardware - F. P. 
Maison; Infotech*; pp. 236-251.
Installability, availablity, reliability and 
maintainability aspects of the Sperry System 11
- A. K. Bhatt, D. R. Mueller; FTCS-14*; pp. 29-35.
Error detecting logic for digital computers - F. F. 
Sellers, M. Y. Hsiao, L. W. Bearnson; McGraw-Hill, 
New York; 1968.
Error detecting codes, self checking circuits and 
applications - J. F. Wakerly; Elsevier - North 
Holland; New York; 1978; Chapter 4, pp. 101-122.
Implementation techniques for self-verification
- R. M. Sedmak; 1980 Test Conf.*; pp. 267-278.
as per 7.9); Chapter 7, pp. 212-330. 
as per 7.17); Chapter 2, pp. 9-53.
Error-correcting codes and self-checking circuits
- D. K. Pradhan, J. J. Stiffler; Computer; March, 
1980; pp. 27-37.
A new class of error-correcting/detecting codes for 
fault-tolerant computer applications - D. K. Prad­
han; IEEE Trans. Comput.; Vol. C-29, No. 6; June, 
1980; pp. 471-481.
as per 7.17); Chapter 7, pp. 156-174.
Arithmetic algorithms for error-coded operands
- A. Avizienis; IEEE Trans. Comput.; Vol. C-22, No. 
16; June, 1973, pp. 567-572.
as per 7.17); Chapter 5, pp. 123-142.
Cost effectiveness of self checking computer 
design - W. C. Carter, G. R. Putzola, A. B. Wadia, 
W. G. Bouricius, D. C. Jessep, E. P. Hsieh, C. J. 
Tan; FTCS-7*; pp. 117-123.
Shift Registers designed for on-line fault detec­
tion - D. K. Pradhan, M. Y. Hsiao, A. M. Patel, S.
Y. Su.; FTCS-8*; pp. 173-178.
as per 7.17); Chapter 3, pp. 54-99.
as per 7.17); Chapter 6, pp. 143-155.
HP 547A Digital Current Tracer; Hewlett Packard 
1986 Product Catalogue; Hewlett Packard,USA;p. 378.
-255-
7.31) A research oriented microcomputer with built in 
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
7.32) Measurement of fault detection mechanisms efficien­
cy : Results - Y. Crouzet, B. Decouty; FTCS-12*; 
pp. 373-376.
7.33) as per 7.9); Chapter 8, pp. 331-367.
7.34) Computer systems reliability: An overview - A.
Avizienis; Infotech*; 1975; pp. 216-233.
7.35) Fault diagnosis in computer control systems - G. 
Edge; Systems Technology; No. 31; April, 1979; pp. 
33-41.
* see section B.5.
-256-
Code Single bit odd parity
Data Bits n
Check Bits 1
Fault Security Single bit-slice faults
Self Test Faults affecting fewer than all bits
Checker p-bit EXCLUSIVE OR tree and 
q-bit EXCLUSIVE OR tree 
(p+q=n+l, where p and q >/ 1)




Fault Security Fault affecting different bits in the 
two circuits
Self Test Fault affecting different bits in the 
two circuits
Checker n-bit TSC equality checker
Comments Most redundancy, expensive checker, not 
self testing for many double faults
Code b-adjacent (odd parity)
Data Bits k x b-bit bytes
Check Bits 1 x b-bit byte
Fault Security Single byte-slice faults
Self Test Faults affecting fewer than all bytes
Checker b x p-bit EXCLUSIVE OR trees, 
b X q-bit EXCLUSIVE OR trees and 
b-bit TSC Morphic AND gate 
(p+q=k+l, where p and q >/ 1)
Comments Cheapest checker out of all codes with 
b check bits
Low cost residue, A = 2 -1 
k x b-bit bytes 
1 x b-bit byte
Single byte slice faults except all 
bits stuck-at-1 or stuck-at-0 
Faults affecting fewer than all bits
k-byte tree of b-bit modulo 2^-1 
adders and TSC equality checker 
Direct implementation of arithmetic 
operations, slower and more expensive 
checker compared with b-adjacent 
code checker
FIGURE 7.1 PROPERTIES OF FOUR SEPARABLE CODES
Code : 
Data Bits : 
Check Bits : 
Fault Security :







BlT| BITjBIT WORD 2











p= parity bit 
n data bits 











FIGURE 7.3 MEMORY ADDRESS DECODING WITH REPLICATION
-258-
4-BIT BYTE MEMORY CHIPS 





D D D I 
15 1+ 13n
k= 4:DATA BYTES 
1:CHECK BYTE





D D D I 
11 10 9 8
DATA
BITS
) D D D  




D D D D 
3 2 1 0
CHECK 
BITS _
P P P F 

































CHECK ^ MEMORY A-








































































—  PATH OF DATA FOR MEMORY READ
FIGURE 7.6 CODE TRANSLATION FOR UNIDIRECTIONAL BUSES 



























BIDIRECTIONAL BUS BIDIRECTIONAL BUS
FIGURE 7.7 CODE TRANSLATION FOR BIDIRECTIONAL BUSES 
ENCODED ADDRESS INFORMATION (PARITY) ENCODED CONTROL INFORMATION
(m-out-of-n)
INTERLEAVED CODES FOR TRANSMISSION
FIGURE 7.8 INTERLEAVED CODES
-260-





t2 -4- < ] r Q
ENCODED CONTROL BUS 
M' b ^  1T ■*"" A 1
"^ al




tk -4 C 11 V— I ak'













O H 3 r i
t2 -4-
a2 X't
tk-‘— O ^ 0 eak
V X*_A'
BREAK DETECTED 
BY DATA PATH 
CODE
BREAK DETECTED 
BY CONTROL PATH 
CODE
















al ■ ■ Y D -
yi
























































4 BRANCH ALLOWED CHECK BIT 
1= ALLOWED BRANCH 
0= INVALID BRANCH













FIGURE 7.17 DIFFERENT PARITY FOR INSTRUCTION AND DATA WORDS
SELF TESTING 












FIGURE 7.18 TWO RAIL (l-OUT-OF-2) TO SINGLE RAIL CONVERSION






GATE 0 GATE 0INPUTSTEST 
NORMAL =0 
TEST =1















































Bl 0/P X X X
B2 I/P X X X
B2 0/P X
B3 I/P X X X
B3 0/P X
*> * CODED BUS 
B1-B3 - BUFFER PACKAGES 
C1-C3 - TSC CHECKERS 









Cl C2 Cl C2
Bl 0/P X X X X
B2 0/P X X X X
B3 I/P >: X X
U3 0/P x' ‘ X
* - CODED BUS 
Bl,B2 - TRI-STATE BUFFER PACKAGES 
B3 ■ BUFFER PACKAGE 
Cl,C2 - TSC CHECKERS 
FAULT - SINGLE FAULT IF CODE IS 
PARITY
(b)





















* - CODED BUS 
B1-B3 - BUFFER PACKAGES 
C1-C3 - TSC CHECKERS
11,12 - ISOLATOR PACKAGES 
FAULT - SINGLE FAULT IF CODE IS PARITY 
ISOLATOR FAULT - AS ABOVE, BUT NEVER AN INPUT 









Bl TX TO B3 B2 TX TO B3
Cl C2 C3 C4 Cl C2 C3 C4
Bl O/P X X X X
11' X X X X
B2 O/P X X X X
‘ :: X X X X
B3 I/P X X X >:
B3 0/P X X
* - CODED BUS 
B1.B2 - TRI-STATE BUFFER PACKAGES 
B3 ■ BUFFER PACKAGE 
C1-C4 ■ TSC CHECKERS
11,12 - ISOLATOR PACKAGES 
FAULT - SINCLE FAULT IF CODE IS PARITY 
ISOLATOR FAULT - AS ABOVE, BUT NEVER AN OUTPUT 
STUCK-AT FAULT OR INPUT TO 
OUTPUT SHORT CIRCUIT
(b)




































CHECKER(S ) DETECTING FAULT
Bl TX TO B2 Bl TX TO B3 B2 TX TO Bl B3 TX TO Bl
Cl C2 C3 C4 C5 C6 Cl C2 C3 C4 C5 C6 Cl C2 C3 C4 C5 C6 Cl C2 C3 C4 C5 C6
Bla X X X X X X X X X X
Bib X X X X X X X X X X
lib X X X X ■V
B2a X X X X X
B2b X X X X X
_.I2b X X X X X
_ B3a X X X X X








TRI-STATE BIDIRECTIONAL BUFFER PACKAGES 
TSC CHECKERS
BIDIRECTIONAL ISOLATOR PACKAGES 
SINGLE FAULT IF CODE IS PARITY
AS ABOVE, BUT NEVER A STUCK-AT FAULT ON THE MAIN BUS OR 











FIGURE 7.26 BIDIRECTIONAL BUS STRUCTURE AND ISOLATOR 
l----
A
INPUTS (^ 2 CHANGE FROM 
<11> to<P0>via<(01>
TRANSITION
FIGURE 7.27 OUTPUT TIMING OF DECODER IN FIG 6.33
-268-
CHAPTER EIGHT : SELF CHECKING SYSTEMS AND
DEVICES : A REVIEW
8.1 : INTRODUCTION
One of the reasons that self checking systems are not 
often designed using commercially available integrated 
circuits is the high cost involved to do so. Many SSI and 
MSI devices are required to implement such a system, as 
the required functions are not available in LSI or VLSI 
form. Possible solutions to this problem are to develop 
circuits incorporating self checking (or fault tolerant) 
mechanisms specfic to the function performed by that 
circuit, or to develop specific self checking circuits 
containing a set of mechanisms which allow self checking 
to be incorporated in a system using traditional circuits.
Some circuits based on the first category are already 
available. The Motorola 68000 and the Intel iAPX 432 are 
two examples which have been already discussed in section 
4.9. The most widespread fault tolerant device is an 
interface circuit for Hamming single error correcting/ 
double error detecting (SEC/DED) coded memory, which is 
available from several manufacturers.
If self checking techniques are taken into account during 
the design of LSI/VLSI integrated circuits, this reduces 
the hardware increase for the introduction of self check­
ing at system level. It also enables designers who are 
not familiar with self checking techniques to incorporate 
them into a system.
A number of self checking devices have been proposed for 
use in self checking systems. However, in general, self 
checking versions of existing integrated circuits have not 
been investigated. So, before reviewing a number of 
proposed self checking devices, section 8.2 examines the 
modifications required to convert an existing integrated 
circuit into.a self checking form. Sections 8.3.1 to
-269-
8.3.8 then briefly review some proposed self checking 
devices. Devices from both categories mentioned above are 
included. Following this, sections 8.3.9 to 8.3.11 detail 
various proposed self checking systems.
8.2 : SELF CHECKING INTEGRATED CIRCUITS
If a totally self checking system is to be constructed out 
of integrated circuits (ICs) which have a built in self 
checking ability, then all ICs must be available in a self 
checking form. There must therefore be self checking NAND 
gate packages (SSI) through to self checking microprocess­
ors (VLSI). The problem is to select a self checking 
technique which can be adapted for any application that 
the IC might be used for.
If, for example, the outputs of a NAND gate package were 
parity encoded, a parity checker could monitor the normal 
gate outputs, as well as the additional pin required for 
parity, to constantly check the output code for failures, 
as shown in Fig. 8.1. However, - as previously discussed 
for check bit prediction in section 7.5, any failure 
producing an erroneous gate output within the IC will not 
be detected. This is because the parity generation is 
based on gate outputs. The only solution is to implement 
the complete parity prediction technique by encoding the 
gate inputs as well. This approach is certainly appropri­
ate for VLSI devices, such as microprocessors, which have 
a bus architecture. It is also the approach adopted by 
Sedmak for his generalised self checking VLSI chip, 
described in section 8.3.7.
Alternatively, and perhaps more appropriate for SSI
packages, is a duplication and comparison of internal
circuitry in normal or complementary form (see section
8.3.7). It is more appropriate, because each gate is
used, in general, for a completely independent purpose.
The NAND package shown in Fig. 8.1 now becomes that shown
in Fig. 8.2. Two additional pins are required for each IC ✓
-270-
to output the l-out-of-2 encoded error signal pair. 
Signal pairs from each IC could drive individual fault 
indicators (via latches), be merged in groups to drive 
fewer indicators, or be merged to give a single fault 
indication. The merging process would use n-input Morphic 
AND gates, then available in IC form (see section 8.3.8). 
However, inputs faults, such as stuck at lines due to 
short or open circuits, right up to the input line fan-out 
points within the IC (see Fig. 8.2) would not be detected. 
Erroneous input signals would be fed to both copies of the 
circuit. This problem would be minimised if the fan-out 
point of each input was at its IC pin. Inputs could be 
encoded and checked at this pin, if this was desired, but 
error signals from each IC would be a true indication of 
the operational state of the IC itself. An additional 
problem to be avoided would be an internal fault which 
caused the error signals to become stuck at <01> or <10>.
8.3 : PROPOSED SELF CHECKING SYSTEMS AND DEVICES
8.3.1 : 68000 Microprocessor
The Motorola 68000 microprocessor has already been dis­
cussed in section 4.9.2 from the view point of its built 
in test capabilities. Nicolaidis has evaluated a self 
checking design of this processor C8.1D. He pays
particular attention to the designs of other researchers 
for a self checking sequencer, indicating that in each 
case there are some real failures within the circuit which 
are not detected. In suggesting a satisfactory sequencer 
he discusses aspects of self checking ROMs, PLAs and 
arithmetic units.
8.3.2 : MIL-STD-1750A Microprocessor
One of the designs studied by Nicolaidis is the VLSI self 
checking MIL-STD-1750A microprocessor proposed by Halbert 
and Bose C8.2D. They make extensive use of PLAs to form 
TSC building blocks, which are then interconnected to form
larger TSC circuits. Using this means they discuss two 
major microprocessor subsystems, the arithmetic unit and 
the microprogram controller. In both cases the objective 
is 100% fault coverage.
8.3.3 : Microprogram Control Unit
Wong et al have designed a self checking microprogram 
control unit C8.33 based on the Advanced Micro Devices 
AM2910 microprogram controller C8.43. The effect of 
physical failures on the outputs of each functional block 
are checked, rather than each individual physical failure. 
Fault assumptions are arbitrary failures in functional 
units and unidirectional errors on control or data trans­
fer buses and stored words. On this basis, Berger, 
modified Berger codes and duplication codes are used. The 
modified Berger code requires fewer check bits than a 
normal Berger code. It is used to check n-bit data words 
in which the number of ones is known to be always much 
less than n (up to 3 in this instance). The unit uses 
only three totally self checking checkers, all of which 
are equality checkers.
8.3.4 : Four-Bit Microprocessor
Crouzet et al C8.5,8.63 consider the attributes of five 
error detection codes; duplication, k-out-of-2k, Berger, 
b-adjacent and residue. Using the constraints of MOS 
circuit design, they detail the coding and checking cir­
cuits required for each code in terms of the number of 
gates (power levels), the number of transistors (size) and 
the number of gate levels (time delay). This forms a basis 
for code comparison. They then apply parity (single error 
detection), duplication (multiple error detection) and k- 
out-of-2k encodings to a specific 4-bit microprocessor. 
A graphical representation of the processor unit is used 
to apply each of the three coding techniques in turn, with 
particular attention to the positioning of the checkers. 
They conclude that the areas of the processor which employ
-272-
duplication require an increase in hardware which is of 
the same order as the increase required by either of the 
other two. Overall duplication is favoured because it is 
easier to design and will detect a large set of faults.
8.3.5 : Error Detection Processor
Chavade et al C8.7,8.8D propose a self checking error 
detection processor for use with commercially available 
integrated circuits in the design of a self checking 
system, as depicted in Fig. 8.3. The facilities of the 
processor include:
1) Data comparison.
2) Code translation and checking (translation between 
parity, 4-adjacent or SEC/DED Hamming codes and 
duplication, or vice versa).
3) Watchdog timer (hardware/software errors).
4) Failure management (error data logging and recovery).
It is able to cope with duplicate synchronous or asynchro­
nous processors and duplicate or encoded blocks devoted to 
data transfer or storage. It is programmable to select 
the operating mode, cascadable from the point of view of 
fault signals and is ideally housed within a 40-pin 
package.
8.3.6 : The PAD
The PAD is a development of the error detection processor 
by the same team C8.93. It is again a self checking LSI 
device for fault detection in a microcomputer and is used 
with standard integrated circuits to form a self checking 
system, as shown in Fig. 8.3.
The system must have twin CPUs but the memory and I/O 
blocks can be either duplicated or encoded. For
duplicated parts the PAD provides:
-273-
1) Comparison.
2) Data exchange between CPUs.
3) CPU sychronisation.
For non duplicated parts it provides:
1) Generation and checking of three codes (single parity, 
4-adjacent and SEC/DED Hamming codes).
2) Control of non duplicated peripherals.
In addition to these architecturally dependent functions, 
the PAD also caters for the following:
1) A watch dog timer check.
2) Detection of memory protection violations.
3) Error management (transient error recovery).
The device allows CPU synchronisation to the nearest clock 
cycle (micro synchronised), or CPU synchronisation at 
program segment or task level (macro synchronised). All 
the PAD fault detection mechanisms are implemented in self 
checking logic. The phases adopted for designing the self 
checking logic are described, along with the influence of 
its integration.
8.3.7 : Generalised VLSI Chip
Sedmak has proposed a generalised self checking VLSI chip, 
shown in Fig. 8.4, which has the following features 
C8.10-8.123:
1) Functional logic which is duplicated in complementary 
form.
2) N comparators to check the two functional block 
outputs as well as intermediate results.
3) Redundant power inputs checked by comparison.
4) Output data and control information encoded in parity 
or an error correcting code.
5) M input data and control line code checkers.
-274-
6) Error encoding and multiplexing logic to generate 
encoded output error signals from the internal check­
er outputs.
7) 4-phase input clock check.
Complementary logic is used for the duplicated circuitry, 
because it eliminates the occurrence of identical mask, 
cell or bridging faults in the two circuits. The signals 
in and out of the complementary circuit are opposite to 
those of its duplicate. The same physical devices are 
used for sequential logic, but control signals and stored 
data are opposite in polarity to those in the duplicate 
circuit. An example of a circuit and its complement is 
given in Fig. 8.5. Sedmak also proposes an error handling 
chip to process the outputs from each individual self 
checking VLSI chip. This chip, shown in Fig. 8.6, merges 
and sorts the information to isolate a fault.
Overall this scheme results in:
1) Immediate detection of all single, most multiple and 
most bridging faults.
2) Immediate detection of power and clock failures.
3) An opportunity to recover from these failures.
4) Automatic isolation of the failed chip or interchip 
connections.
5) A chip count increase of 5.5% over a conventionally 
checked VLSI design.
8.3.8 : Morphic AND Gate
In reference C8.133 Wakerley proposes a 4-input Morhic AND 
gate with polarity control to allow input encodings in 
either the duplication or l-out-of-2 codes. The circuit 
is to be housed in a 14-pin package containing 4 x 2-input 
gates, 16 x 4-input gates and two 8-input gates, as shown 
in Fig. 8.7. This is the design given by equation (6.9) 
and shown in Fig. 6.23b. Five of these chips would be 
required for a 16-bit equality checker. One advantage of
-275-
using SSI gates is that they can be organised to be well 
checked for single failures. The custom chip approach is 
quite vulnerable to failures in which a pair of lines on 
the chip become stuck at <10> or <01>. This can be avoid­
ed by using a separate custom chip to generate each output 
line of the pair (independent subcircuits), at the expense 
of increasing the number of chips required. However, 
assuming only unidirectional failures, the custom chip 
approach is quite adequate.
8.3.9 : VLSI Building Blocks
Rennels et al have presented the results of a study to 
establish a standard set of VLSI building blocks C8.143. 
These blocks are assembled with commercially available 
microprocessors and memory into fault tolerant distributed 
computer configurations. The resulting multi-computer 
architecture uses self checking computer modules. A 
redundant bus system is employed for communication pur­
poses between modules. The blocks use many of the tech­
niques discussed in this Chapter. They are designed to 
meet a number of important conditions as follows:
1) They must interface directly with a variety of 
commercial microprocessors and memory.
2) Existing bus and I/O standards must be closely 
followed.
3) The resulting system architecture must have a high 
fault coverage.
The self checking computer module consists of four 
building blocks.
1) An error detecting and correcting memory interface 
which provides Hamming correction to damaged data, 
replacement of a faulty bit with a spare, parity 
encoding and decoding of internal buses and detection 
of internal faults.
2) A Programmable Interface which can be programmed to
-276-
perforin either the function of a bus adapter or a bus 
controller.
3) The Core which detects CPU faults by synchronously 
comparing two CPUs, collects fault indications from 
itself and other building blocks, attempts a recovery 
from transient faults and disables the processor on 
detection of a permanent fault.
4) Digital I/O to provide eight functions, which include 
serial and parallel I/O with appropriate checking 
techniques.
8.3.10 : Research Microcomputer
In the real time reasearch oriented microcomputer with 
built in auto diagnostics, Moreira de Souza et al place a 
great emphasis on safety, by preventing the transmission 
of erroneous data to its outputs C8.153. Using commerci­
ally available components, they achieve this with the 
following:
1) Duplication and comparison of the CPU (Intel 8080).
2) Parity encoding of bit-sliced memory data and I/O 
data.
3) Duplication and comparison of control circuitry.
4) A fault tolerant clock C8.163 with inhibit switches.
5) Bus isolation (described in section 7.8.2).
6) Fault Analysis Module.
The comparators, parity checkers and fault analysis module 
are all implemented as self checking circuits. The 1-out- 
of-2 encoded error signals from the various checkers are 
processed by the fault analysis module, which indicates 
the failed module and inhibits the clock. A system with 
8 I/O ports, 4K words of RAM and 4K words of ROM requires
2.8 times the amount of hardware as a system without built 
in autodiagnostics, but this figure decreases with in­
creased memory and I/O. However, it is suggested that if 
special checking circuits were available, this figure 
could be reduced to a 1.15 increase.
-277-
8.3.11 : Self Checking Computer Costs
As a significant development from their early work, Carter 
et al have studied the cost effectiveness of a self
checking computer design C8.173. The design is complete­
ly self testing. It also has a retry capability without 
speed degradation and significant additional hardware. 
The processor unit consists of ten chip types. The
particular checking technique employed for each chip is 
described, along with the additional hardware overheads 
that this checking creates for each chip, in terms of the 
number of additional gates required within the chip, the 
number of additional chip pins required and the number of 
additional chips.
They also propose a check chip which can perform any one 
of the following functions:
1) Compute and check even byte parity on two bytes 
producing 2 parity signals and one checkable 1-out-of- 
2 pair.
2) Compute the parity of 27 lines producing a single 
output.
3) Act as a 12-input Morphic AND gate, or as a 11-input 
Morphic AND gate and check 4 odd parity lines, or
compare 2 pairs of lines, producing one checkable
l-out-of-2 pair.
Thirteen of these check chips are used within the self 
checking computer. Overall, the additional hardware 
required for checking purposes is again dependent on the 
amount of memory, but for 8K words it is 38%. The system 
has byte-sliced storage and is designed to detect all 
single faults. This results in considerable multiple 
fault coverage, typically 64-80%.
8.4 : CONCLUSIONS
-278-
The self checking devices and systems reviewed above show 
that duplication and comparison, along with coding, are 
the most widely used techniques for self checking pur­
poses. Programmable logic arrays (PLAs) are often used to 
implement the self checking mechansims.
Coding is principally applied to data transmission paths 
(control, address and data information) and memory. 
Duplication and comparison is employed for more complex 
circuitry, such as control and arithmetic units.
Parity continues to be the most predominant form of coding 
used, because of its minimum redundancy and because it 
requires a minimum amount of circuitry for generation and 
checking purposes. However, its error detection capabili­
ties are becoming a major limitation, particlarly in VLSI 
circuits, so more complex codes, such as Berger codes, are 
beginning to be adopted. Control information is often 
encoded in an m-out-of-n code.
Duplication and comparison is expensively employed because 
it is so straightforward to implement. The technique 
requires a minimum amount of design effort and will also 
detect a large set of faults. Even if a complete module is 
not duplicated, it is likely that the more complex parts 
of it will be, simply because there is no other viable 
technique. The duplicate circuitry can be constructed in 
normal or complementary form, depending on whether it is 
separate from, or integrated within the functional 
circuitry.
The main aim of this investigation is to develop a micro­
processor based system with fault diagnostics provided by 
self checking circuits. Existing research most relevant 
to this aim is the research oriented computer of Moreira 
de Souza et al C8.153. However, whilst discussing the 
various self checking techniques adopted in their system, 
they do not discuss fault detection, fault indication and 
error control logic in any detail. There is also no
-279-
comment on fault diagnosis for their system. In addition, 
the research computer uses unidirectional data buses, so 
there are no checking and isolation considerations for 
bidirectional buses.
All of the above aspects are detailed in the description 




8.1) Evaluation of a self-checking version of the 
MC68000 microprocessor - M. Nicolaidis; FTCS-15*; 
pp. 350-356.
8.2) Design approach for a VLSI self-checking MIL-STD- 
1750A microprocessor - M. P. Halbert, S. M. Bose; 
FTCS-14*; pp. 254-259.
8.3) The design of a microprogram control unit with 
concurrent error detection - C. Y. Wong, W. K. 
Fuchs, J. A. Abraham, E. S. Davidson; FTCS-13*; 
pp. 476-483.
8.4) Am2910A microprogram controller; Bipolar Micropro­
cessor Logic and Interface, 1985 Data Book; 
Advanced Micro Devices Inc.; Sunnyvale, California, 
USA; pp. 5/166-5/183.
8.5) Design of self-checking MOS-LSI circuits. Applica­
tion to a four-bit microprocessor - Y. Crouzet, C. 
Landrault; FTCS-9*; pp. 189-192.
8.6) Design of self-checking MOS-LSI circuits. Applica­
tion to a four-bit microprocessor - Y. Crouzet, C. 
Landrault; IEEE Trans. Comp.; Vol. C-29, No. 6; 
June, 1980; pp. 532-537.
8.7) A monolithic self-checking error detection process­
or - J. Chavade, M. Vergniault, P. Rousseau, Y. 
Crouzet, C. Landrault;1980 Test Conf.*;pp. 279-286.
8.8) Design specifications of a self-checking detection 
processor - Y. Crouzet, C. Landrault; FTCS-10*; 
pp. 275-277.
8.9) The PAD : A self-checking LSI circuit for fault- 
detection in microcomputers - J. Chavade, Y. 
Crouzet; FTCS-12*; pp. 55-62.
8.10) Fault-tolerance of a general purpose computer 
implemented by very large scale integration - R. 
M. Sedmak, H.L. Libergot; FTCS-8*; pp.137-143.
8.11) Implementation techniques for self-verification
- R. M. Sedmak; 1980 Test Conf.*; pp. 267-278.
8.12) Design for self-verification: An approach for deal­
ing with testability problems in VLSI-based designs
- R. M. Sedmak; 1979 Test Conf.*; pp. 112-120.
8.13) Error detecting codes, self checking circuits and
applications - J. F. Wakerly: Elsevier - North
Holland; New York; 1978; Chapter 8, pp. 175-206.
8.14) A study of standard building blocks for the design 
of f^ult-tolerant distributed systems - D. A.
-281-
Rennels, A. Avizienis, M. Ercegovac; FTCS-8*; pp. 
144-149.
8.15) A research oriented microcomputer with built in 
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
8.16) Fault-tolerant digital clocking system - J. Moreira 
de Souza, E. Peixoto Paz; Electronics Letters; Vol. 
11, No. 18; September 4, 1975; pp. 433-434.
8.17) Cost effectiveness of self checking computer 
design - W. C. Carter, G. R. Putzola, A. B. Wadia, 
W. G. Bouricius, D. C. Jessep, E. P. Hsieh, C. J. 
Tan; FTCS-7*; pp. 117-123.















f CHIP GO0D<01>or<10> 


































FIGURE 8.3 TYPICAL ARCHITECTURE OF A SYSTEM USING EITHER 
THE ERROR DETECTION PROCESSOR OR THE PAD











































(a) FUNCTIONAL CIRCUIT (b) COMPLEMENTARY CIRCUIT
FIGURE 8-5 COMPLEMENTARY LOGIC 
TO PROCESSOR CLOCK, INTERRUPT LOGIC AND CONSOLE OR TO OTHER ERROR
































ERROR SIGNALS OR OUTPUTS 
FROM OTHER ERROR HANDLING CHIPS











' ■ $ > -
















FICURE 8.7 4-INPUT TSC COMPARATOR WITH INPUT 
POLARITY CONTROL
-285-
CHAPTER NINE : AN EXPERIMENTAL SELF CHECKING COMPUTER
9.1 : INTRODUCTION
An experimental self checking computer, described in this 
Chapter, brings together a substantial amount of the 
theory, ideas and designs presented in Chapters 5 to 8. A 
microprocessor based system is converted from being total­
ly unchecked, to fully checked in certain specific areas. 
The two principal aims of this computer are to provide 
fault diagnosis to chip or circuit level, as appropriate, 
and to demonstrate the implementation of various self 
checking techniques.
A number of conditions are imposed on the unchecked and 
checked designs:
1) The unchecked microprocessor based system should be 
representative of typical designs and be implemented 
with widely used integrated circuits.
2) The use of complex circuitry should be avoided in the 
checked system.
The unchecked system is described in section 9.2. The 
checked system is not intended, at present, to be totally 
self checking, but section 9.3 discusses possible self 
checking techniques for every aspect of the unchecked 
system. Specific circuitry used in the checked system, is 
also detailed. The fault detection and error control 
circuits adopted in the experimental computer are des­
cribed in section 9.4, whilst section 9.5 investigates 
several practical implementations of TSC comparators. 
Section.9.6 considers the testing of the resulting system.
9.2 : THE UNCHECKED SYSTEM
The experimental computer is a minimal microprocessor 
based system using components from the Motorola 6800 
family C9.13* It has three modules; central processing
-286-
unit (CPU), memory and input/output (I/O). Each module is 
housed on a separate board, see Fig. 9.1.
The following letters are appended to each signal name, 
principally to identify the bus signals within each board. 
The address bus, A0-A15, is given as an example in each 
case.
C - CPU board 
M - memory board 






A description of each board in the unchecked system is 
given below.
The CPU board, illustrated in Fig. 9.2, consists of:
a) An MC6800 8-bit microprocessor C9.13.
b) An MC6875 2-phase clock generator C9.1D, with crystal 
and reset switch.
c) One octal non inverting transceiver package, to buffer 
the data bus to or from the backplane. Its direction 
is controlled by the CPU board read/write line (R/WC). 
It is enabled during the second clock phase (02 at 1).
d) Two octal non inverting line driver packages, to 
buffer the address bus to the backplane. They are 
enabled by the valid memory address (VMA) line.
e) One octal non inverting line driver package, to buffer 
the control bus to the backplane. It is permanently 
enabled.
The memory board, illustrated in Fig. 9.3, consists of:
a) One 2K (2048) words x 8-bit erasable programmable read 
only memory (EPROM). It is enabled by addresses in 
the range E000-E7F7^g and FFF8-FFFF^^. It cannot be 
be enabled by a write operation. The EPROM contains
the Motorola Minibug II monitor routines, with addi-
✓
-287-
tional commands to load, print and move memory
contents C9.23.
b) Two IK words x 4-bit random access memories (RAMs).
They are enabled by addresses in the range A000-
A3FF^g. A memory write operation can only take place 
during 02.
c) A 3-to-8 line decoder from address lines 13 to 15 (A13 
-A15). The decoder provides enables for memory and 
I/O devices. It is enabled by VMA. Additional gates 
provide specific enables for the I/O devices.
d) One octal non inverting transceiver package, to buffer 
the data bus to or from the backplane. Its direction 
is controlled by R/WM. It is enabled by addresses 
in the range AOOO-BFFF^ and EOOO-FFFF^, during 02.
e) Two octal non inverting line receiver packages, to
buffer the address bus from the backplane. They are 
permanently enabled.
f) One octal non inverting line receiver package, to
buffer the control bus from the backplane. It is
permanently enabled.
The I/O board, illustrated in Fig. 9.4, consists of:
a) An MC6850 Asynchronous Communications Interface Adap­
tor (ACIA) C9.1D with an RS232 compatible line driver 
and line receiver as a terminal interface. The ACIA 
is enabled by addresses 8008^ and 8009^.
b) An MC14411 Bit Rate Generator C9.13 and crystal. This 
provides the ACIA transmit and receive clocks, such 
that the ACIA operates at 300 BAUD.
c) An MC6821 Peripheral Interface Adaptor (PIA) C9.13. 
Peripheral data lines PB0-PB7 and peripheral control 
lines CA2 and CB2 are assigned as outputs. They each 
drive light emitting diodes (LEDs). Peripheral data 
lines PA0-PA7 are assigned as inputs. A set of 
switches provide a 0 or a 1 to each input. The PIA is 
enabled by addresses 8004-8007^.
d) One octal non inverting transceiver package, to buffer 
the data bus to or from the backplane. Its direction
is controlled by R/WI. It is enabled by addresses in 
the range 8000-9FFF^g, during 02.
e) Two octal non inverting line receiver packages,, to 
buffer the address bus from the backplane. They are 
permanently enabled.
f) One octal non inverting line receiver package, to 
buffer the control bus from the backplane. It is 
permanently enabled.
9.3 : THE CHECKED SYSTEM
This section discusses self checking techniques for all 
aspects of the unchecked system. In certain cases, the 
specific hardware used in the experimental computer is 
detailed.
9.3.1 : CPU
The CPU is duplicated and the outputs of the two process­
ors compared, as shown in Fig. 9.5. This is the most 
convenient technique to adopt. A self checking processor, 
as proposed in sections 8.3.1 and 8.3.2, could be 
implemented using discrete gates and devices, but this is 
not practical or relevant to this experimental computer.
The compared outputs of the dual CPUs are the address and 
data buses, plus control lines VMA, BA and R/W. The three 
comparators, shown in Fig. 9.5, are totally self checking 
(TSC). Their outputs can be merged into a single 1-out- 
of-2 error signal. The outputs of the functional CPU 
become the system signals. However, since the data bus is 
bidirectional, and therefore a set of inputs as well as 
outputs, information on this bus must be applied to both 
CPUs during a processor read operation (R/W=l). This is 
achieved with the tri-state unidirectional buffer shown in 
Fig. 9.5. It is enabled during a processor read
operation only.
No additional circuitry is required to synchronise the two
-289-
6800 CPUs. They have a common clock ((f)l and (J>2 generated 
from the MC6875) and during normal operation perform
identical operations from a reset condition.
The scheme of Fig. 9.5 could form the basis of a self
checking microprocesor, as described in section 8.2 for 
a NAND gate package. All inputs would have to fan out to 
both processors from their respective chip pins, in order 
to minimise internal faults affecting both processors in 
the same manner. This has also been discussed in section
8.2. In addition, all inputs to both processors ideally
need to be checked. The checking of clock, data bus, halt 
and reset inputs is considered in other sections. The 
rest of the inputs, NMI,IRQ, TSC and DBE, are not used and 
therefore connected to the appropriate logic level. These 
inputs are difficult to check, because they are static 
levels. They could individually, or as a group, be 
encoded with their encoding checked externally at the 
relevant chip pins. Alternatively, they could become part 
of the maintenance procedures for non self testing 
circuitry, as discussed in section 7.8.4. Mechanisms 
would be provided to exercise these normally static inputs 
and the resultant response of the processors checked. 
This process would be effected by a combination of hard­
ware and software.
Power line failures also need to be considered, both with­
in a chip and external to it. Short circuit decoupling 
capacitors, for example, cause a power line failure and 
are difficult to locate. Recently, however, fail safe 
capacitors have been introduced, which predominantly fail 
to an open circuit condition C9.31. Power rails are often 
monitored for failures by light emitting diodes. The 
generalised VLSI chip described in section 8.3.7 has 
redundant power inputs, which are fed separately to the 
functional and duplicate logic. They are checked intern­
ally by comparison. However, in general, the technique of 
feeding redundant inputs separately to each of the two 
circuit copies and the internal checking of external
-290-
inputs, unless carefully implemented, should be avoided. 
This is because failures external to the chip will then 
cause it to generate an error signal. Replacing the chip
under these circumstances will be futile, since there is
nothing wrong with the chip itself. In the generalised 
VLSI chip, it is assumed that the occurrence of input and 
power failures can be specifically identified from the
encoded error signals. Static inputs and power rails are
not checked in the experimental computer.
9.3.2 : System and CPU Clocks
The MC6875 clock generator provides the CPU clocks, MPU01 
and MPU02, plus the system clock, BUS02. The BUS02 clock 
signal is checked with the TSC periodic signal checker 
described in section 6.9.
Fig. 9.6 shows the output circuitry of the MC6875, from 
which it can be seen that MPU01 and MPU02 are the Q and Q 
outputs of a T-type flip-flop, whilst BUS02 is MPU01 
inverted. BUS02 is fully checked for period, mark-space 
and stuck-at faults. If this signal is correct, then 
MPU01 cannot have period or mark-space faults. The only 
fault which can affect it is a wire break between the 
fan-out point of Q (see Fig. 9.6) and its integrated 
circuit pin. This will give the appearance of a stuck-at 
fault to the checker. Similarly, if BUS02 is correct, 
then it is unlikely that MPU02 will be a signal of 
different period or mark-space ratio, since it is gener­
ated from the same flip-flop. Therefore, only stuck at 
faults are considered possible on MPU02.
If BUS02 is fully checked, it is thus unnecessary to fully 
check MPU01 and MPU02. These two signals only need to be 
checked for stuck-at faults. Since MPU01 and MPU02 
naturally form a l-out-2 encoded pair, this is achieved 
by connecting them directly to the fault indication 
circuitry. Fig. 9.6 shows the two checking mechanisms 
employed. If a fault in the output flip-flop of the
-291-
MC6875 causes <MPU01,MPU02> to become stuck-at <01> or 
<10>, this not a problem, because the fault will be detec­
ted by the TSC periodic signal checker on BUS02.
9.3.3 : Reset Line
Part of the MC6875 clock chip is used to provide power-on 
and manual reset facilities, as shown in Fig. 9.7. The 
reset line, RSTC on the CPU board, is normally static at 
1. However, during a manual reset it is at 0. A simple 
fault detection mechanism is therefore a light emitting 
diode (LED), driven form the reset line via a buffer. 
This is illustrated in Fig. 9.7. The power-on reset might 
not be observed, but a manual reset will cause the LED to 
be lit for as long as the reset switch is depressed. If 
the t.f d does not indicate during this operation, or is 
permanently lit, the source of the failure can be easily 
traced to the LED, its driver, the reset line, the MC6875 
or the reset switch. This circuitry is therefore self 
testing, assuming that the reset switch is depressed at 
some point during normal operation.
9.3.4 : Data Transmission Paths
Single bit odd parity is used for all data transmission in 
the experimental computer. It is chosen for the following 
reasons:
1) Minimum generation circuitry - a standard MSI in­
tegrated circuit.
2) Minimum checking circuitry - totally self checking 
EXOR trees.
3) Minimum redundancy.
4) Adequate error detection capability.
5) Odd parity is maintained by high impendance buses.
The last two reasons are expanded in subsequent para­
graphs .
-292-
All the line driver, line receiver and transceiver 
integrated circuit packages used in the experimental 
computer for data transmission are eight bits wide; i.e. 
they contain eight of each particular device. A parity 
bit is therefore added to each group of eight, or up to 
eight lines. The data bus, address bus and a number of 
control lines grouped together as a control bus, are all 
parity encoded as follows:
1) Data Bus : 1 parity bit, DP, covering data lines D0- 
D7.
2) Address Bus : 2 parity bits; API covering address 
lines A0-A7 and AP2 covering address lines A8-A15.
3) Control bus : 1 parity bit, CP, covering control lines 
02, VMA, R/W and RST.
The parity bit for each group of eight, or up to eight 
signals is processed by an independent buffer, as indica­
ted in Fig. 9.8.
Single bit parity will detect all single bit faults in a 
word. It will also detect all faults affecting an odd 
number of bits. A non transforming circuit for parity 
encoded data, such as Fig. 9.8, is self testing for all 
faults affecting less than all bits. This means that an 
enable or power line failure, which causes an all 1 or an 
all 0 output from the parity bit buffer or the data bit 
buffers, will eventually be detected. If tri-state 
buffers are used in the circuit of Fig. 9.8, then tie-up 
resistors will maintain a correct odd parity word (nine 
l's) on the bus when the buffers are in their high 
impedance state. A similar situation occurs if a failure 
in the direction line of a bidirectional buffer package 
causes two sets of buffers to drive a bus.
The parity generation and checking used for each board is 
now considered.
CPU Board : Four parity bit generators are required, as
-293-
indicated in Fig. 9.9. These are all 74LS280 9-bit odd/ 
even parity generators/checkers C9.4D. Input data to each 
generator must be checked, because the generator is a non 
code disjoint circuit; it will produce a correct parity 
bit for erroneous input data. The address bus, data bus, 
R/RC and VMAC are all checked by CPU comparison, whilst 
02C and RSTC are checked by the mechanisms described in 
sections 9.3.2 and 9.3.3 respectively.
The parity generator for the data bus also requires a tri­
state buffer at its output, as shown in Fig. 9.9. This is 
enabled for a processor write operation only (R/WC=0). 
During a processor read operation, data bus parity is 
generated, or supplied by the transmitting device.
All the parity generation circuits described above could 
be included in the scheme of Fig. 9.5 for a self checking 
microprocessor. The chip would then not only indicate an 
internal failure, but have parity encoded outputs as well.
The three buses are checked on the CPU board with TSC EXOR 
trees, as described in section 6.7.
I/O Board : Both the ACIA and PIA require parity genera­
tion for the data bus during a processor read operation. 
This is achieved with a 74LS280 parity generator and a 
tri-state buffer, as shown in Fig. 9.10. The buffer is 
enabled for a processor read operation from that particu­
lar I/O device only, since other devices will supply 
parity when they are read and the CPU board supplies 
parity during a processor write operation. Parity is 
checked on all three buses with TSC parity checkers.
Fig. 9.10 assumes that each I/O device is isolated from 
others on the I/O board, so that the scheme presented is 
repeated for every I/O device. This is true if the 
isolating circuits are fully implemented, as suggested in 
section 9.2.8. The I/O board then becomes two sub- 
modules, one for the ACIA and one for the PIA. An
-294-
alternative structure is shown in Fig. 9.11, where the 
ACIA and PIA share a single data bus parity generator. 
The tri-state buffer connected to the output of the parity 
generator, is now enabled when the processor reads either 
the ACIA or the PIA; i.e. an I/O board read.
Memory Board : The memory board data, address and control 
buses are all checked using TSC parity checkers. The use 
of parity within the memory itself, is considered in the 
next section.
Parity is also checked on all three buses of the back­
plane.
9.3.5 : Memory
The EPROM used is 8-bits wide, whilst the RAM chips are 4- 
bits wide. A single parity bit appended to each data word 
is therefore inadequate to check many internal decoder 
(addressing) failures. Three possible solutions are:
1) Use an 8-bit byte error detecting code for the EPROM 
and a 4-bit byte error detecting code for the RAM.
2) Replace both types of memory with equivalent bit- 
sliced devices.
3) Store the address parity bits with each data word.
The first solution is not adopted because of the added 
complexity of circuitry required; a) for checking the 
memory codes and b) for code translation between bus and 
memory codes (see section 7.3.3) The additional redundant 
memory required for this solution is, however, not a 
problem. The second solution, whilst being suitable for 
microprogrammed memory, is not so for conventional micro­
processor based systems. It is rare, for example, to find 
bit-sliced EPROMs employed in such a system.
The third solution is the one adopted in the experimental 
computer, principally because the required parity bits are
-295-
already available. However, limitations of this technique 
using parity have already been considered in section
7.3.2.
A second 2K words x 8-bit EPROM is added to the system to 
hold the necessary check bits. This is referred to as the 
check EPROM, whilst the existing EPROM is referred to as 
the data EPROM. The control lines and address lines to 
this check EPROM are identical to those used for the data 
EPROM. Data and address parity bits are generated from 
the data stored in the data EPROM and the address used to 
access this data. This information is then loaded into 
the check EPROM, before it is installed into the system. 
Additional hardware is required to process its parity bits 
and this is shown in Fig. 9.12.
The data line of the check EPROM used for data parity, 
DPME, is connected directly to the data parity line of the 
memory board, DPM. It is simply an additional line for 
the data bus and therefore treated as such. There is no 
parity generation required for * either type of memory. 
During a processor read from EPROM, the address parity 
bits in the check EPROM, APME1 and APME2, need to be 
compared with corresponding address parity bits generated 
on the CPU board. These are APM1 and APM2 respectively on 
the memory board. This is achieved with the 2-bit TSC 
equality checker shown in Fig. 9.12. Two tri-state 
buffers are also required and these are enabled during a 
processor read from EPROM, allowing APM1 and APM2 to be 
presented to the equality checker. There they are 
compared with APME1 and APME2 from the check EPROM. At all 
other times, outputs APME1 and APME2, as well as outputs 
from the two tri-state buffers, are in a high impedance 
state. Tie-up resistors on all these four outputs 
maintain identical inputs (l's) to the comparator. If 
this mechanism fails, or a mismatch of address parity bits 
occurs when the EPROMs are enabled, then the TSC equality 
checker will indicate an error. Note that during normal 
operation, both EPROMS are not enabled by a processor
-296-
write cycle.
A third IK words x 4-bit RAM is added to the system to 
hold the three parity bits. This is referred to as the 
check RAM, whilst the existing RAMs are referred to as the 
data RAMs. The control and address lines to the check RAM 
are identical to those used for the data RAMs. The 
additional hardware required to process the parity bits is 
shown in Fig. 9.13. As for the check EPROM, the check RAM 
data parity line, DPMR, is connected directly to the 
parity line of the memory board, DPM. Two tri-state 
buffers are connected between the address parity bits of 
the memory board, APM1 and APM2, and the corresponding 
check RAM address parity bits, APMR1 and AMPR2. A TSC 
equality checker compares the inputs and outputs of these 
two buffers, as shown in Fig. 9.13. During a proccessor 
read operation from RAM, the buffer outputs are disabled, 
so the equality checker compares APM1 and APM2 with APMR1 
and APMR2 respectively. At all other times, including a 
processor write to RAM, the buffers are enabled, so the 
equality checker compares ' identical signals. If these 
buffers fail, or a mismatch of parity address bits occurs 
when the RAM contents are read, then the TSC equality 
checker indicates an error.
If the isolation circuits are fully implemented within the 
system, both RAM and EPROM will become sub-modules of the 
memory board, as indicated in the previous section for I/O 
devices.
9.3.6 : Address Decoding
The 3-to-8 line decoder used for address decoding produces 
a 1-out-of-n encoded output when enabled. However, as 
indicated in section 6.8, if the enable is also included 
as part of the output, then a 1-out-of-n code output is 
always produced. Note that the 1-out-of-n code is active 
low in this instance, i.e. a single 0 in n bits.
-297-
The effects of single failures within this type of decoder 
have already been considered” in Fig. 6.36. On this basis, 
the 1-out-of-n code checker described in example 6.7 is 
adequate for checking the decoder outputs. The design of 
example 6.7 is expanded in Fig. 9.14 to check a l-out-of-9 
code. No modifications are required to check an active 
low 1-out-of-n code. The decoder and its checker could 
form the basis of a self checking decoder integrated 
circuit.
The additional gates used to generate the ACIA and PIA 
enables, ENA and ENP repectively, are best checked by 
duplication and comparison, as shown in Fig. 9.14. The 
duplicate circuit can be constructed in normal or comple­
mentary logic. Alternatively, these gates can be replaced 
with a 2-to-4 line decoder, which has AM2 and AM3 as 
inputs and EN8/9 as an enable. This decoder can then be 
checked using the l-out-of-5 checker, described in example 
6.7. This scheme uses less packages than the original
gates with duplication and comparison. Only the main
decoder is checked in the final form of the experimental 
computer.
9.3.7 s I/O Devices
The ACIA again uses the duplication and' comparison tech­
nique employed for the CPU. The circuit for the ACIA is 
given in Fig. 9.15. A tri-state unidirectional buffer is 
also required, enabled this time by a processor write
operation to the ACIA (R/WI=0 and ACIA enabled). Both
devices then receive identical data information. Only the
functional ACIA drives the data bus during a read opera­
tion. Outputs compared with TSC equality checkers are the 
data bus (D0-D7), plus the RTS, lR$ and Tx Data signals.
The MC14411 bit rate generator has sixteen different
output clock rates. One of them is connected to both the
transmit and receive clock inputs of the ACIA (Tx Clk and
Rx Clk respectively). This clock signal is checked by the
TSC periodic signal checker described in section 6.9, as 
shown in Fig. 9.16. The monostable periods of the checker 
are adjusted to match the selected clock signal.
Input and output circuitry for the ACIA, which in this 
case is an RS232 compatible receiver and driver respec­
tively, can be duplicated and compared if this is desired, 
see Fig. 9.17a. An alternative arrangement is shown in 
Fig. 9.17b, which assumes an access to, at least, the 
transmit data output and receive data input of both ACIAs. 
This is possible in the experimental computer, but not if 
the ACIA is constructed as indicated in Fig. 9.15. For a 
given fault, Fig. 9.17a would also, in general, produce a 
a different set of fault indications to Fig. 9.17b.
9.3.8 : Control Logic
The self checking computer has various individual gates, 
or combination of gates, which process at least two 
control signals (this includes enables), to produce 
further control signals. In addition, a number of control 
bus signals are inverted on each board to match the 
required input levels of specific devices. Fig. 9.18 
gives some examples of this control logic from the memory 
board.
All control logic must be checked in a self checking 
computer . In the experimental computer, all inputs to 
this logic are taken from the control bus and/or the 
address decoder outputs. Both the control bus and the 
address decoder outputs are checked (parity and 1-out-of-n 
respectively), so whilst the inputs to the control logic 
are not, in general, encoded, they are checked. If this 
circuitry uses gates from self checking logic packages, as 
proposed in section 8.2, then it is inherently checked. 
Alternatively, each circuit must be duplicated and 
compared, as suggested for part of the address decoding 
logic in section 9.3.6.
-299-
9.3.9 : Isolators
From section 7.8.2, isolation circuits should be installed 
as follows:
1) A unidirectional isolation circuit in series with
every receiver, in a single transmitter to multiple
receiver structure - Fig. 9.19a.
2) A unidirectional isolation circuit in series with
every transmitter, in a multiple transmitter to single 
receiver structure - Fig. 9.19b.
3) A bidirectional isolation circuit in series with every 
slave transceiver 'A' bus connection, in a master 
transceiver to multiple slave transceiver structure
- Fig. 9.19c.
In all cases, multiple must also include single; i.e. 
single transmitter to single receiver and single trans­
ceiver to single transceiver structures. In general, 
then, there must ideally be an isolating circuit between 
every transmitter (gate output) -and receiver (gate input). 
On this basis, the experimental computer would require 
many unidirectional and bidirectional isolating circuits. 
This is clearly impractical, so a limited amount of these 
circuits are included for evaluation.
In the experimental computer, isolating circuits are 
placed between the backplane and the memory and I/O 
boards, as shown in Fig. 9.20. Unidirectional isolators 
are used for the control and address buses, with bi­
directional isolators for the data bus. The circuits 
adopted are those shown in Figs. 7.25 & 7.26b and derived 
in Appendix B. Appendix B also presents a fault table for 
each isolator. The non inverting line receivers buffering 
the control and address buses from the backplane are 
replaced with inverting types, since the unidirectional 
isolators are inverting. The data bus requires an 
additional TSC parity checker. This is between its 
isolating circuits and the transceiver which buffers it to
-300-
or from the backplane, see Fig. 9.20. The bidirectional 
isolators have a common control line, the base drive for 
each transistor. Thus, in a similar manner to the trans­
ceivers, independent drive buffers are provided for the 
data bit isolators and the parity bit isolator, as shown 
in Fig 9.20.
The fault table for the bidirectional isolator in Appendix 
B demonstrates that both data paths must be used during 
normal operation to fully check each isolating circuit 
and its associated buffer. It also implies that there
must be at least two data paths to achieve this. The data 
paths in this instance are the CPU board to the memory
board and the CPU board to the I/O board.
9.4 : FAULT INDICATION AND ERROR CONTROL LOGIC
The requirements of the fault indication and error control
logic, for diagnostic purposes, are as follows:
1) An indication of the output - from every TSC checker 
at the falling edge of clock phase 02. The fault
detection circuits are therefore not monitored contin­
uously. The falling edge of 02 is chosen, because it 
terminates every processor cycle.
2) An ability to permanently indicate a fault until the 
fault indication logic is reset. Permanent faults 
will automatically do this, but a transient fault may 
occur during a single cycle only and never be 
observed.
3) An ability to halt the processor on detection of an 
error. This prevents an error propagating throughout 
the whole system, creating a misleading error indica­
tion.
4) A means of manually halting the processor. The fault 
indication logic can be observed without any system 
activity (see section 9.6).
5) A means of testing all non self testing parts.
-301-
These requirements are achieved with the circuitry des­
cribed below.
The l-out-of-2 encoded output pair from each TSC checker 
is processed in the manner of Fig. 9.21, where an EXNOR 
gate merges the output pair into a single line. An 
additional EXOR gate then provides a means of testing this 
and subsequent non self testing circuitry, as previously 
discussed in section 7.8.1. A switch and buffered control 
line are required to provide this test function. The 
output from this EXOR/EXNOR arrangement feeds a J-K flip- 
flop, which can be configured as a D-type flip-flop for 
cyclic fault indication, or configured as a clocked S-R 
latch for permanent fault indication. The configuration 
of the J-K flip-flop is controlled manually, by a switch. 
The flip-flop is clocked by 02 and reset by RST. Both of 
these system signals are already checked (see sections
9.3.2 and 9.3.3). A light emitting diode is used as the 
fault indicator, driven directly from the Q output of the 
flip-flop.
In addition to feeding its own error indication circuit, 
each checker output is also connected to an n-input 
Morphic AND gate, in the manner of Fig. 9.22. This 
converts all the l-out-of-2 encoded pairs to a single 
l-out-of-2 encoded pair. This final pair is then merged 
and latched, with the inclusion of a test gate, to produce 
a signal which can halt the processor, if this is desired. 
A switch shown in Fig. 9.22 controls this action. The 
processor may be halted at any time, via another switch. 
A light emitting diode indicates the state of the halt 
line. This combination of switch and LED also allows the 
halt line to be tested in a similar manner to the reset 
line. An additional LED could be connected to the the Bus 
Available (BA) line of the CPU, as this is directly affec­
ted by a halt operation. However, this is unnecessary, as 
a failure in either CPU will be detected by the comparison 
process.
-302-
9.5 : PRACTICAL IMPLEMENTATION OF TOTALLY
SELF CHECKING COMPARATORS
Section 6.6 has given three structures for an n-input 
Morphic AND gate. These are a 2-level AND/OR structure, 
cascaded 2-input Morphic AND gates and cascaded 2-input 
Morphic AND gates with level merging. The following types 
of integrated circuit (IC) are now considered to implement 
these structures.
1) Single gates.
2) 74LS51 AND-OR-INVERT gates C9.43.
3) 4-input Morphic AND gate (see section 8.3.8).
Fig. 9.23 compares the number of IC packages required for 
an 8-input Morphic AND gate, constructed with the 
appropriate type(s) of IC for each structure.
The 4-input Morphic AND gate would require the minimum
number of packages. Using the 74LS51 is the second best
solution, in terms of the number- of packages, but results
in six logic levels. The cascaded 2-input Morphic AND
gates with and without level merging both require the same 
amount of packages, in this example, but the former 
approach is achieved using less logic levels. The 2 level 
AND/OR solution is totally impractical. Aside from the 
128-input OR gates, it requires 256 packages. An
advantage of using SSI packages in TSC structures is 
that, in general, the design can be well checked for
single package failures. Failures can occur in the 74LS51 
and custom chips, which cause a pair of lines to become 
stuck at <01> or <10>.
9.6 : TESTING THE EXPERIMENTAL COMPUTER
Possible checking mechanisms for each part of the experi­
mental computer have been discussed in previous sections. 
Not all these proposals have been adopted in the practical 
system. Some sections have already detailed the actual
-303-
hardware added to the unchecked system. Fig. 9.24 shows 
the complete schematic for the self checking computer as 
implemented. Self checking techniques have been applied 
to specific areas of the system for evaluation, so the 
computer is not totally self checking. Exhaustively 
testing the complete system is therefore inappropriate. 
The areas which need to be tested are as follows:
1) System clock generation.
2) Reset (switch, logic and line).
3) Halt (switch and line).
4) Main address decoder.
5) Memory parity (EPROM and RAM).
6 ) Isolating circuits between backplane and memory and 
I/O boards.
Appendix B has demonstrated that fault diagnosis is 
assisted by fault indications when the system is inactive. 
The situation is effected by halting both CPUs. Halting 
a 6800 CPU causes its address bus, data bus and R/W line 
to be put into a high impedance state, whilst VMA is 
forced low. The system clock (02) remains active, as does 
the system reset line. Fig. 9.25 details the effects of 
halting the processor on the rest of the system. There 
are no active memory or I/O enables. Tie-up resistors 
force floating lines to a logic 1 state. As a result odd 
parity is maintained on the address and data buses. The 
system also enters a read mode (R/WC=1). Parity con­
tinues to be generated throughout the whole system for the 
control bus, and on the CPU board for the address bus.
Fig. 9.26 details the faults detected by the CPU, reset 
line, halt line, system clock and main decoder checking 
mechanisms. It also gives the checker which will detect 
the faults in each case. Various faults were applied 
to these areas to confirm the contents of Fig. 9.26. In 
general, these were stuck-at faults, created by replacing 
a line transmitter with a 0 or 1 source. In all cases, 
the error control circuitry is set to halt the processor
-304-
on detection of an error. Timing faults were created in 
the clock generation circuitry by adjusting the oscillator 
frequency and also replacing the 6875 clock generator with 
an external clock source. The software used for test 
purposes is either the normal command routines of Minibug 
II C9.23, or specially written routines. These programs 
enable specific devices to be activated when required, or 
specific data to be transmitted on the three buses.
Fig. 9.27 details the faults detected in the memory 
circuits and by which checkers. Again, various stuck-at
faults and words with incorrect parity confirm these 
results. A number of words with incorrect parity are 
stored in the EPROMs when they are programmed, for test 
purposes. RAM words with incorrect parity are obtained by 
reading memory which has not previously been written to, 
or storing words with incorrect parity.
Tests on the isolating circuits and their associated
buffers are based on the fault analysis tables given in
Appendix B. These fault tables-are modified for use with 
the self checking computer in Figs. 9.28 and 9.29. All 
the single faults in these tables were simulated in the 
isolating circuits as they were developed, but are 
repeated in the overall system to confirm the checker 
indications. Both fault tables detail the faults detected 
when the CPUs are halted, which, again, are confirmed by 
practical fault simulation. However, since a number of 
signals in the control bus are not permanently at 1 when 
the CPUs are halted, faults detected during this condition 
are detailed separately for this bus in Fig. 9.29.
Overall, the level of fault diagnosis in the experimental 
computer is dependent on the level of fault detection, 
which in turn is dependent on the type of checking 
mechanisms used and the number of checkers provided.
The duplication and comparison of the CPU will detect most 
faults (all except those which produce an identical change
-305-
in the outputs of both CPUs) However, the fault cannnot 
be diagnosed to the functional CPU or its duplicate, so 
both must be replaced if an error is signalled from the 
CPU comparator (C5). Both devices would automatically be 
replaced if they were within a single self checking chip.
In the three categories of faults for the clock generation 
circuitry (period, mark-space and stuck-at faults) all
faults are detected with the two checkers employed (Cl and 
C2). If only a subset of all faults is considered, then 
the checking mechanisms can be greatly simplified from 
those necessary to detect all faults. The predicted 
effects on the address decoder outputs due to all single 
internal faults, for example, has significantly reduced
the complexity of the required 1-out-of-n checker, since 
it is only necessary for the checker to detect two output 
errors (two or zero active outputs).
The single bit parity code, used for the encoding of all 
buses in the experimental computer, will detect all faults 
which cause an odd number of erroneous bits in an encoded 
word. Assuming that all combinations of bits occur in 
both the data information bits and the parity bit at some
time during normal operation, then all faults except the
bus stuck at a codeword will be detected eventually. 
If all faults are to be detected immediately, then the 
encoding must be capable of detecting all bit errors.
The comments made above for a parity encoded bus also 
apply to the storage of parity encoded data in RAM or 
EPROM. The comparison of address parity bits, stored 
along side the data in memory, with generated address 
parity bits during a memory read operation allows all 
faults in the storage of these bits to be detected, as 
well as faults in the comparison process. The comparison 
process will also detect a number of internal memory 
addressing errors. If the addressing errors detected are 
inadequate, then a more sophisticated code than parity 
will be required for the stored address check bits.
-306-
The diagnosis of an isolator fault from the checker 
outputs to the buffer associated with that isolator is 
dependent on two important factors. Firstly, the obser­
vation of fault indications from a halted system, which 
assist the fault diagnosis, and secondly, the use of 
both data paths during normal operation (CPU board <— > 
memory board and CPU board <— > I/O board), to ensure that 
certain failures are detected. The latter factor requires 
that the system is operated in a particular manner for the 
detection of failures and is therefore an 'operation for 
test'.
Another operation for test used in the experimental 
computer, more as a confidence test rather than a 
necessary operation for the detection of failures, is the 
accessing of data from memory with known parity errors. 
The manual operation of the reset, halt and test switches 
are also operations for test.
-307-
9.7 : REFERENCES
9.1) M6800 microcomputer system design data,, plus data 
sheets for MC6821 (DS 9435), MC6875 (DS 9485) and 
MC14411 (ADI-306); Motorola Semiconductor Products 
Inc.
9.2) Fault diagnostics for microprocessor based systems 
- T. J. Hollis; Project report for the degree of 
B.Sc. in Electrical Engineering; University of Bath, 
England; 1980; Chapter 4 and Appendix 8.
9.3) Fail-safe capacitors - G. Lloyd; Electronic Engin­
eering; June, 1984; pp. 71-2.
9.4) The TTL book for design engineers - The engineer­
ing staff of Texas Instruments components group;


















































































— \  
— /
>












FIGURE 9.2 SCHEMATIC FOR THE UNCHECKED CPU BOARD
BACKPLANE
DBO-















c —  Ov
>



































































SCHEMATIC FOR THE UNCHECKED MEMORY BOARD




















D R - - R /W I
\ —























Tx D a ta
0 7
RS



















































( X 8 )
SCHEMATIC FOR THE UNCHECKED I/O BOARD
SELF CHECKING INTEGRATED CIRCUIT
CONTROL 






















w a n A t«
rat









- H  ERROR
1—OUT—OF—2
CMP -  TSC EQUALITY CHECKER 
_ *_ 1_zOUJj-OF-2_ENCODED_ PAJR______________ j
FIGURE 9.5 A SELF CHECKING 6800 MICROPROCESSOR
CLOCK GENERATOR MC6875
DrDinmr >1loO rtKlvJUIG 
SIGNAL
ru rrv ro )
I
































FIGURE 9.7 CHECKING THE RESET LINE
BUFFERS
DATA BITS  N
(C8)
PARITY BIT
FIGURE 9.8 PROCESSING PARITY ENCODED DATA












A A -H S E V E N
■DPC
(ODD PARITY BIT)







A -H  S E V E N
V
) A C 8 ~
AC15
APC1











































































I/O BOARD PARITYFIGURE 9.11
-315-




































FIGURE 9.12 EPROM PARITY







































FIGURE 9.13 RAM PARITY




















FIGURE 9.14 CHECKING THE ADDRESS DECODING




























CMP -  TSC EQUALITY CHECKER 
*  -  1—OUT—OF—2 ENCOOED PAIR

















































































,  v SINGLE TRANSMITTER




, x MULTIPLE TRANSMITTER
( b )  TO SINGLE
RECEIVER STRUCTURE
MASTER SLAVES
U ni =  Unidirectional Isolator
,  v MASTER TRANSCEIVER
( c )  TO MULTIPLE SLAVE
TRANSCEIVER STRUCTURE
Bi =  Bidirectional Isolator


























































































Uni =  Bidirectional Isolator * =  M (MEMORY BOARD)
Bi *  Unidirectional Isolator OR 1 ( I/O  BOARD)






























ro FIGURE 9.22 ERROR CONTROL LOGICt—* ----------------I
‘ INTEGRATED CIRCUIT TYPE
STRUCTURE SINGLE GATES 74LS151 4-I/P MORPHIC AND
3 LEVELS CASCADED 4-I/P CATES
2-LEVEL 256 x 8-I/P AND CATES XX
AND/OR 2 x 128-I/P OR GATES XX 4-I/P GATE USES
IMPRACTICAL IN THIS FORM AND/OR STRUCTURE
CASCADED 6 LEVELS
2-I/P MORPHIC 24 x 2-I/P AND GATES 6 LEVELS 4 LEVELS
AND GATES 18 X 2-I/P OR GATES 7 PACKAGES 3 PACKAGES
(FIG. 6.27a) 10.5 PACKAGES
CASCADED 4 LEVELS
2-I/P MORPHIC 16 X 2-I/P AND GATES
AND GATES 2 X 2-I/P OR GATES XX XX
WITH LEVEL 8 X 4-I/P OR GATES XX XX
MERGING 4 X 4-I/P AND GATES
(FIG. 6.27b) 10.5 PACKAGES









U n i 1—OUT—O f—8 
CHECKERCNTRL \y


















FAULT INDICATION AND 
ERROR CONTROL LOGIC






RC.Rfl.RU.RI - T i e - u p  re a i. to re  on 
a pp ro p ria te  b u t u
R0M.R8I -  T ie -up  rea le to re  on 
d a to  bua on ly
NO ENABLE SICNALS 
SHO W  FOR BUFFERS 
B132.B5.B6 k B8




SET OF 4 
TSC PARITY 
CHECKERS
U n i Un id irectiona l la o la to r
TRI-STATE BUFFER
FIGURE 9.24 SCHEMATIC FOR THE EXPERIMENTAL SELF CHECKING COMPUTER
CPU BOARD BACKPLANE
DC0-DC7, DPC (<>2«0) -> 1
DC0-DC7, DPC ($2*1) * 1 DB0-DB7, DPB -» 1
AC0-AC15 -) 1 AB0-AB15. APB1, APB2 -> 1
APCI, APC2 - 1 $2B _TLT
$2C _n_r VMAB ■ 0
VHAC - 0 R/RB « 1
r /Rc -> 1 ESTb * 1
Rstc - l CPB ~u-i_
CPC


























ENO/1-ENE/F, ENA, ENf> - l -u~i_
_rv_r : CHANGING LEVEL
* 0 : DRIVEN LEVEL
* 1 s DRIVEN LEVEL
-» 1 : TIE-UP RESISTOR LEVEL
FIGURE 9.25 THE SYSTEM WITH CPUs HALTED
UNIT DETECTABLE FAULTS CHECKER DETECTING FAULTS
CPU ALL FAULTS EXCEPT IDENTICAL FAILURES IN BOTH PROCESSORS C5
RESET LINE ALL FAULTS C3
HALT LINE ALL FAULTS C4
CLOCK : BUS 02 PERIOD, MARK-SPACE, STUCK-AT Cl
: MPU$1,MPU$2 STUCK-AT C2
DECODER
NO ACTIVE OUTPUT 
TWO ACTIVE OUTPUTS 
(EVEN NO. OF ACTIVE OUTPUTS)
C12
(Checkers are those in Fig. 9.24)
FIGURE 9.26 VARIOUS SYSTEM FAULTS
-323-







ALL SINGLE AND MANY MULTIPLE FAULTS IN 
DATA AND DATA PARITY BITS
DATA
PARITY C9D etc. READ
ALL FAULTS IN ADDRESS PARITY BITS
ADDRESS
PARITY CIO READ
ANY FAULT CAUSING THE EFFECTIVE INTERNAL 
ADDRESS TO BE AN ODD HAMMING DISTANCE FROM 
THE DESIRED ADDRESS IN ITS HIGH (A8-A15) 
OR LOW (A0-A7) BYTES
RAM
ALL SINGLE AND MANY MULTIPLE FAULTS IN 
DATA AND DATA PARITY BITS
DATA
PARITY C9D etc. READ
ALL FAULTS IN ADDRESS PARITY BITS
ADDRESS
PARITY Cll READ
ANY FAULT CAUSING THE EFFECTIVE INTERNAL 
ADDRESS TO BE AN ODD HAMMING DISTANCE FROM 
THE DESIRED ADDRESS IN ITS HIGH (A8-A15) 
OR LOW (A0-A7 > BYTES
BUFFER









Notes : 1) Faults detected exclude pin stuck-at faults, except where stated.
2) An address parity error is an error in parity bits API and/or AP2.
3) The checkers are those in Fig. 9.24.
4) * indicates that a fault nay be detected when the CPUs are halted.
5) C90 is the data bus checker of C9.
6) Read * Read fron that device.
7) Read/Write ■ read/write froa/to any device.




CHECKER INDICATIONS SINGLE FAULTS DETECTED (DATA BUS ONLY, INCLUDING PARITY)
C6D C7D C8 C9D C13 C14D
X X X B3b SA0/1 A ) , R B  S/C A ) , B 4 a  SAO A )
X X X B3b SAO/I (=U),RB S/C A ) , B 7 a  SAO ibi
X X B3b SAO/I (4*M>, D1 0/C (A, D2 0/C <4^),RB S/C (fr^ M)
X X X B4a SAO <<W),B4a SA1 ( A , R B M  S/C ( A
X X X B7a SAO (4*M),B7a SA1 ( A , RBI S/C ( A
X X T1BE S/C or 0/C A ),T1CE 0/C A ),T1BC 0/C A ) , B D 1  SAO A), RBI 0/C A ) , B 4 a  SA1 A ) , R B M  S/C A)
X TICE S/C ( A , B D 1  SA1 ( A ) , D 1  S/C (A ) ,T1BC S/C ( A )
X X T2BE S/C or 0/C'A),T2CE 0/C A ),T2BC 0/C A ) , B D 2  SAO A ) , R B 2  0/C A ) , B 7 a  SA1 A ) , R B I  S/C A)
X T2CE S/C ( A , B D 2  SA1 < A ) , D 2  S/C (A ) ,T2BC S/C ( A )
X X B4a SAO ( b )
X X B7a SAO ( ^»)
CPUs HALTED
X X B3b SAO
X X X B4a SAO
X X X B7a SAO
NOTES : 1) X » fault indicated.
2) SAO • stuck-at-0 ; SA1 ■ atuck-at-1.
3) O/C ■ open circuit ; S/C ■ short circuit.
M M
4) <— ■ read from memory board ; -* * write to memory board.
5) 4^ » read from I/O board ; ■ write to I/O board.
6) T1,RB1,D1,BD1 * any of these components in the bidirectional Isolators of II.
7) T2,RB2,D2,BD2 * any of these components in the bidirectional Isolators of 12.
8) CnnD is the data bus checker of Cnn.
9) Checkers, buffers. Isolators and bus tie-up resistors are those of Fig. 9.24.
10) Transistors, resistors, diodes and base drive buffers are those of Fig. B.5.
FIGURE-9.28 BIDIRECTIONAL ISOLATOR FAULTS
NORMAL OPERATIONl
CHECKER INDICATIONS SINGLE FAULTS DETECTED
C7* C9* C14*
X X X B3b SAO/1,RB S/C
X B4a SAO/1, ALL Tl FAILURES, RBI 0/C, RBE1 S/C, RC1 S/C
X B7a SAO/1, ALL T2 FAILURES, RB2 0/C, RBE2 S/C, RC2 S/C
X X RBI S/C
X X RB2 S/C
CPUs HALTED : ADDRESS BUSES 1 AND 2
X X X B3b SAO
X B4a SA1#
X B7a SA1#
CPUs HALTED : CONTROL BUS|
X X X 02B SAO , VMAB SA1, R/RB SAO, K5TB SAO, CPB SA1
X 02BM SAO , VMABM SA1, -R/RBM SAO, K3TBM SAO, CPBM SA1
X #2BI SAO , VMABI SA1, R/RBI SAO, K5TBI SAO, CPBI SA1
NOTES : 1) X =■ fault indicated.
2) SAO * stuck-at-0 ; SA1 = stuck-at-l.
3) O/C * open circuit ; S/C * short circuit.
4) Tl, RBI, RBE1, RC1 * any of these components in the
unidirectional isolators of II.
5) T2, RB2, RBE2, RC2 * any of these components in the
unidirectional Isolators of 12.
6) BM * between II and B4.
7) BI * between 12 and B7.
8) * * A1 for address bus 1 (A0-A7.AP1), A2 for address bus 2 
(A8-A15.AP2) or C for the control bus (02,VMA,R/R,K3T,CP).
9) Checkers, buffers and isolators are those of Fig. 9.24.
10) Transistors and resistors are those of Fig. B.9.
11) # * includes line between buffer and isolator.
FIGURE 9.29 UNIDIRECTIONAL ISOLATOR FAULTS
-326-
CHAPTER TEN : CONCLUSIONS AND FURTHER WORK
The purpose of this investigation has been twofold. 
Firstly, to devise a method for the design of totally self 
checking (TSC) circuits from first principles and, second­
ly, to develop a microprocessor based system with on line 
fault detection, so that faults can be diagnosed to 
individual integrated circuits (XCs), without the need for 
external equipment. These two objectives have been 
brought together in the experimental self checking compu­
ter of Chapter 9.
The first objective of the investigation has been achieved 
with a technique based on the testing requirements of 
single and cascaded logic gates. A TSC circuit is, by 
definition, self testing and fault secure. Fault security 
is automatic in a circuit where each output of the 1-out- 
of-2 encoded pair is computed with an independent sub­
circuit. Therefore it is only necessary to determine if 
the circuit is self testing, for it to be TSC. This is 
achieved in the case of 2-level AND/OR and OR/AND struc­
tures with the method proposed in Chapter 6. The Boolean 
function of the circuit depicted on a Karnaugh map is 
visually inspected to determine if the circuit is self 
testing. This is a rapid process, which will identify why 
the circuit is not self testing, so that it can be 
modified and re-evaluated.
The technique of using Karnaugh maps to design TSC 
circuits is not suitable for TSC circuits with a large 
number of inputs. However, complex TSC circuits or 
networks are generally constructed from a number of 
smaller.TSC circuits, either by design or necessity. 
Examples are the TSC comparator and TSC 1-out-of-n code 
checker respectively. The design of other TSC circuits, 
including m-out-of-n and Berger code checkers, should be 
attempted using the Karnaugh map method. The technique 
does not cater for TSC circuits constructed from EXOR 
gates, notably parity checkers, which have been largely
-327-
designed by intuition. This is an another area which 
could be explored.
The second objective of the investigation has been 
achieved by the application of a number of techniques to 
convert a minimal, but typical, microprocessor based 
system from being unchecked to fully checked. These 
techniques include the parity encoding of all data trans­
mission paths, the duplication of VLSI devices and control 
circuitry, the storage of address parity bits in memory 
and the installation of signal isolating circuits. Check­
ing circuits are then added at strategic points in the 
system to achieve the desired level of fault diagnosis 
from their output indications. These checking circuits 
check codes, compare two sets of inputs and check periodic 
signals. Additional circuitry is provided to display and 
process their outputs.
The major limitation of the checking circuits, is that if 
they are designed strictly according to the Boolean 
function of their specification,- then faults within them 
can mask the indication of a fault in the system they are 
monitoring. This problem is resolved by designing TSC 
checkers. This then brings together the two original 
objectives. The result is an experimental self checking 
computer for the demonstration and evaluation of both 
objectives.
The two principal objectives of the experimental computer 
have also been met. In the areas which are checked, 
faults in an IC or its associated components, such as 
isolating circuits, are diagnosable to that chip. If only 
board level fault diagnosis is necessary, then consider­
ably less fault indication and error control logic will be 
required.
The implementation of various self checking techniques has 
also been demonstrated. These are duplication and com­
parison, parity generation and checking, 1-out-of-n code
-328-
checking and periodic signal checking. It has been shown 
that duplication and comparison along with parity encoding 
are the two most widely used techniques.
The isolating circuits enhance the fault diagnosis of the 
system, rather than assisting its self checking abiltites. 
Overall, a considerable amount of additional hardware is 
required, primarily for the self checking techniques, but 
none of it is complex.
All types of single fault have been simulated in the 
isolating circuits and many single and multiple faults in 
other checked parts of the system. If the efficiency of 
the fault detection mechanisms is to be evaluated for all 
faults (single and multiple), then a computer simulation 
of faults in the system will be more appropriate than a 
physical fault simulation.
Additional checking mechanisms could be added to the 
experimental computer to make it totally self checking. 
Many more isolation circuits would then be required, 
resulting in a considerable increase in the hardware for 
checking. So far, there have been no constraints placed 
on the amount of hardware used for checking purposes. 
However, an examination of the checked computer reveals 
that it already has considerably more than five times the 
number of IC packages used in the original unchecked 
system. Clearly, such a system will only be economically 
viable if ICs are available which are self checking 
versions of existing devices, or provide a set of checking 
mechanisms for use with standard devices, as indicated in 
Chapter 8.
If a single chip or group of chips can be identified as 
faulty from the indications of the checking circuits, 
which might require the combined indications from both a 
halted and an operational system, then it would be possi­
ble to display this information, in words, on a terminal. 
This would extend the fault diagnosis capabilities of the
-329-
system from a manual to an automatic process and allow a 
rapid replacement of the faulty chip(s). Alternatively, 
this information could he used in fault tolerant applica­
tions to effect error correction and/or error recovery. 
In all cases it must be remembered that the checking 
circuits are TSC, so that an indicated error may be due to 
a fault in the checker itself and not a fault in the 
signals it is monitoring.
Proposals have been given in Chapter 9 for the construc­
tion of a self checking MC6800 microprocessor and a self 
checking MC6850 ACIA. Both schemes use the duplication 
and comparsion technique with the addition of a tri-state 
buffer. The MC6821 PIA can be checked in a similar manner 
to the ACIA, for its application in the experimental 
computer. However, it is more difficult to construct a 
general self checking PIA, because, externally, it is not 
known which I/O lines have been programmed as inputs and 
which as outputs. Access must be gained to the internal 
data direction and control registers for this information, 
or a redundant set of registers added externally. A self 
checking PIA needs to be the subject of a separate 
investigation.
The isolation circuits should be further developed to 
remove existing limitations in their design. Certain 
undetectable faults ideally need to be eliminated. In 
addition, the circuits rely on there being more than one 
data path, ie. a structure with a single transmitter and 
greater than two receivers, for the detection of certain 
faults. If the isolators are installed in single trans­
mitter to single receiver structures, then a number of 
additional faults become undetectable.
After further development, the isolating circuits can then 
be integrated as a separate package, or within the devices 
they are isolating. The latter is achieved by modifying 
the input and output circuitry of a gate, so that a 
physical stuck-at fault at an IC pin is impossible; i.e.
-330-
an input or output line of a gate can still be driven to a 
logic 0 or 1 in the presence of an internal input or 
output 'stuck-at' fault. However, whilst unswitched 
receiver input isolators would assist fault diagnosis in 
an unswitched single tranmitter to multiple receiver 
structure, switched isolators in all switched structures 
would not improve fault diagnosis, without the additional 
checkers between each transmitter output or receiver 
input and its isolating circuit. Some further investiga­
tion is therefore required into this aspect of isolation 
circuit integration.
The use of opto-couplers and transformers as isolating 
circuits was considered during the development of these 
circuits. Opto-couplers were rejected on the basis of 
their cost, a relatively high input current and a large 
propagation delay, whilst a commercially available trans­
former bus isolator package was rejected because it would 
not process a 1MHz clock waveform. However, both com­
ponents, in principle, are ideal as isolation circuits, 
particularly opto-couplers which are easily integrated, 
and should therefore be investigated further.
In most of the discussion about self checking checkers, 
only TSC checkers have been considered. The exception to 
this is the 1-out-of-n code checker used in the experimen­
tal computer, which, in general, is a self testing only 
checker. However for its specific application to check 
the outputs of a 3 to 8 line decoder, in which the effects 
on these outputs of all single faults have been ascertain­
ed, it is deemed to be TSC. Self testing only (STO) and 
partially self checking (PSC) checkers could have been 
employed in the experimental computer, since the immediate 
detection of errors is not essential, only the eventual 
detection of errors; i.e. there has been more of an 
emphasis on self testing rather than fault security. 
However, it is considered more important to develop and 
evaluate checkers suitable for any application, hence the 
bias towards TSC checkers. The additional hardware
-331-
required for TSC checkers is unlikely to be a problem when 
they are integrated.
In Chapter 6 the process of level merging was presented 
for TSC equality checkers constructed from cascaded 2- 
input Morphic AND gates. The technique, in general, 
reduces the circuitry required for these checkers and also 
their propagation delay. Its application is, however, 
dependent on the size of the comaprator. Whilst it is 
ideal for comparators implemented with discrete logic, as 
in the experimental computer, it is more likely than the 
2-level AND/OR or OR/AND structures, also presented in 
Chapter 6, will be adopted for integrated TSC comparators 
with less than nine input pairs.
The theory and design of TSC sequential circuits has not 
been extended from that given in Chapter 5, principally 
because there was no requirement for them in the experi­
mental computer. The most likely application of self 
checking sequential circuits in such a system, aside from 
any sequential control circuitry, is a multiple phase 
clock generator. It would therefore be worth invest­
igating the possibility of TSC sequential circuit design 
using a method based on Karnaugh maps.
It would take some considerable time to create a complete 
set of self checking versions of standard integrated 
circuits. In addition, the amount of error processing 
required in a system using self checking chips exclusive­
ly, where each chip has a l-out-of-2 encoded output error 
signal, would be vast. The fault coverage of such a 
system would also have to be carefully analysed.
Overall, at present, it is considered more beneficial to 
pursue the development and construction of integrated 
circuits containing a set of self checking mechanisms, 
including self checking checkers, which will allow self 
checking to be incorporated into a system using standard 
integrated circuits. The construction and development of
-332-
isolating circuit packages should also be pursued, so that 
they and self checking checkers can be positioned to 
provide adequate fault diagnostic information. A micro­
processor based system can then be designed with a 
minimum hardware overhead to provide the desired level of 
self checking and fault diagnosis.
-333-
APPENDIX A : PRACTICAL IN-CIRCUIT EMULATION
Extensive in-circuit emulation has been practically im­
plemented on the microprocessor based system depicted in 
Fig. A.l, using a Millenium Microsystem Analyser (in- 
circuit emulator). This work is detailed below.
A.l : OPERATION OF THE PROCESSOR SYSTEM
Refering to Fig. A.l, a terminal connected to the 
'terminal1 programmable communications interface (PCI), 
via RS232 or 20mA interfaces, can be effectively switched 
through the 'processor' PCI and further RS232 or 20mA 
interfaces to any one of sixteen processing units (com­
puters). The communication path between the terminal and 
its processor is 'full duplex'.
The processor is selected by a four bit binary code stored 
in the route latch which controls a multiplexer/demulti- 
plexer configuration, connecting the appropriate processor 
to the processor PCI. This code is indicated (in hexa­
decimal) on the (7-segement) route display.
Not all processors are available to the terminal, this 
being controlled by the 'allowed processor' switches. In 
addition, one of seven possible speeds of communication is 
selected by the 'baud rate selection' switch.
Each system has four terminals, so the circuitry shown in 
the dotted box in Fig. A.l is duplicated three times. 
Note that the terminal and processor PCIs operate in 
pairs.
A terminal connected via 20mA interfaces to the 'terminal' 
PCI, provides monitoring and control for the whole system.
A.2 : THE TESTS
A set of seven programs was written to test the
-334-
system, using the operating system of the Microsystem 
Analyser CA.1,A.2,A.3D. These reside in two EPROMs 
located on the analyser. Prompt and fault messages are 
displayed on the twenty character display of the instru­
ment. The tests run individually, or in sequence under 
emulator control and consist of the following:
1) Switch Test:
a) The emulator prompts for the 'allowed processor' 
and 'baud rate selection' switches to be set to 
5 5 ^  and then reads them to check this. An 
error message gives the switch number and the 
actual data read,
b) As for a), but switches set to AA-^ g.
2) EPROM Test.
A checksum is compiled and verified for each EPROM 
in turn. An error message gives the actual checksum 
compiled.
3) RAM Test.
An 8-bit binary number, which starts at 00^, 
Exclusively ORed with the most significant (MS) 
address byte of the first location in memory and the 
result stored in that location. The binary number 
is then incremented, Exclusively Ored with the MS 
address byte of the second location and stored in 
that location. This procedure continues until all 
the RAM has been written to. RAM contents are then 
read back and compared with that written. An error 
message details written and read data.
4) Display Test.
0, 1, 2 through to is written to route display
'0' and then repeated for displays 1, 2 and 3 in
succession. A software delay is incorporated 
between changes, so that the sequence can be 
observed. This is a purely visual test.
-335-
5) PCI Test 1.
Links are made between the processor PCI interfaces 
and the terminal PCI interfaces, as detailed in Fig.
A.2a. For each connection indicated, a complete 
binary count (00-FF^g) is transmitted at all possi­
ble baud rates (110-19200 baud) from the processor 
PCI to the terminal PCI and then vice versa. 
Interrupts normally used to indicate 'transmitter 
(TX) ready' and 'receiver (RX) ready' are inhibited. 
The error message details the two PCIs in use, the 
processor port selected, plus the direction and baud 
rate of communication, along with one of three 
faults:
a) No character transmitted.
b) No character received.
c) Character sent and character received (when 
different).
6) PCI Test 2.
This is the same as for PCI test 1, except it uses 
a different set of connections, as detailed in Fig.
A.2b, and also includes tests using the monitor PCI.
7) Interrupt Test.
Using the same links as for PCI test 2, this test 
detects the occurrence of transmitter and receiver 
interrupts for all PCIs, by transmitting 5 5 ^  at 
all baud rates and in both directions; terminal to 
processor and processor to terminal. The interrupts 
are interrupt requests (IRQ) for the terminal and 
processor PCIs and a non-maskable interrupt (NMI) 
for the monitor PCI. An EPROM contains software 
routines and vectors to handle the interrupts, 
replacing the upper control EPROM of the board. A 
prioritisation of interrupts occurs for the IRQ and 
this is also tested. An error message for the 
general interrupt test details the two PCIs in use, 
the processor port selected, plus the direction and 
baud rate of communication and one of four errors:
-336-
a) No IRQ detected.
b) Correct and actual levels of IRQ pribrity.
c) No NMI detected.
d) IRQ level detected instead of NMI.
The error message for the interrupt priority test 
gives either a) or b) above.
A.3 : THE TESTS IN USE
The tests have proved to be an effective means of testing 
and fault diagnosing the board, in what is essentially an 
automated procedure, although the emulator halts on 
errors. They also highlight the main limitation of in- 
circuit emulation and for that matter functional testing, 
which is circuit visibility, or rather a lack of it.
Consider the following example. In test 5) a PCI error is 
detected with a fault message of 'NO RX" , i.e. the 
receiving PCI has not detected a character. Referring to 
Fig. A.3, it cannot be ascertained directly if the fault 
lies within the:
i) Transmitting PCI - eg. theoretical but not physical 
transmission.
ii) Transmitter Interface - eg. component failure, 
iii) Interface Links - eg. not properly connected, 
iv) Receiver Interface - eg. component failure, 
v) Receiving PCI - eg. physical but not theore-tical 
reception.
It is assumed that the main data, address and control 
buses are functional, since this would have been determin­
ed from previous tests.
The fault is located by running another test routine, 
which continuously transmits a character at a predetermin­
ed baud rate between a definable pair of PCIs. An 
oscilloscope is then used to trace signal flow between the 
two PCIs.
A.4 : REFERENCES
A.l Millenium Microsystem Analyser; Operators Manual; 
Pub. No. 87000001, release 2.0 ;May 1980; Millenium 
Systems Inc.
A.2 Programming with uSA Microsystem Analyser - B. 
Hordos; uSA applications note no. 1; Millenium 
Systems Inc.
A.3 Diagnostic programming for microprocessor-based sys­





CLOCK |— * CPU 2KEPROM
TERMINAL
RX RDY TX RDY
TERMINAL
RX TX -- >
PCI
2 X












a) 20mA/RS232 indicate an
interface of that type.





































FIGURE A.l MICROPROCESSOR BASED SYSTEM USED WITH IN-CIRCUIT EMULATION
TERMINAL
PCI (No. ) 0 1 2 3
CONNECTED 
VIA INTERFACE (Type) RS232 20mA RS232 20mA RS232 20mA RS232 20mA
TO PROCESSOR 
PORT (No. ) 4 C 5 D 6 E 7 F
FOR SELECTION 
BY PROCESSOR PCI (No. ) 4 5 6 7
(a) CONNECTIONS FOR PCI TEST 1
TERMINAL
PCI (No.) 0 ]L
• 3 8
CONNECTED 
VIA INTERFACE (Type) RS232 20mA RS232 20mA RS232 20mA RS232 20mA 20mA
TO PROCESSOR 
PORT (No.) 0 8 1 9 2 A 3 B F
FOR SELECTION 
BY PROCESSOR PCI (No.) <I
c 6I 1 4
(b) CONNECTIONS FOR PCI TEST 2









(i) (ii) (ill) (iv) (v)
FIGURE A.3 PCI TEST EXAMPLE
-340-
APPENDIX B : SIGNAL ISOLATION CIRCUITS
B.l : INTRODUCTION
Sections 7.8.2 and 8.2.8 have demonstrated a need for 
signal isolation circuits. These sections have also 
discussed the positioning of isolators in various circuit 
structures. This appendix summarises the work carried out 
to design bidirectional and unidirectional isolating 
circuits, principally for the self checking computer of 
Chapter 9.
Isolating circuits are allowed to fail, but not in either 
of the following two ways:
1) A stuck-at fault at the signal terminal connected to 
the line from which it is providing isolation.
2) A undetectable short between the two signal terminals.
Both of these failures destroy the isolation properties of 
the circuit. The effects of single failures appropriate 
to each component used form the basis of fault tables for 
the various isolation circuits. These are mainly short 
and open circuits in resistors, diodes and transistor 
junctions. All of these failures are based on discrete 
components. Faults caused by substrate failures, (as 
opposed to circuit failures) when the components are 
integrated, are not considered.
The structure used to evaluate each isolating circuit is 
one of the following:
1) Single transmitter to two receivers.
2) Two transmitters to a single receiver.
3) Master transceiver to two slave transceivers.
They are all assumed to represent one bit of a k-bit 
parity encoded bus.
-341-
B .2 : SWITCHED UNIDIRECTIONAL ISOLATORS
The unidirectional isolating circuits used by Moreira de 
Souza et al CB.13 in their research computer have already 
been presented in section 7.8.2. The isolator they used 
for single transmitter to multiple receiver structures, 
a series resistor and CMOS buffer, is shown again in Fig. 
B.la. Note that the receivers are assumed to be high 
speed CMOS buffers CB.23, so the series CMOS buffer is now 
unnecessary. It is also assumed that only one path, out 
of the two indicated, is selected for data transmission 
at any time. Fig. B.lb gives the fault table for this 
structure. This reveals, as expected, that a short 
circuit isolating resistor is undetectable. Moreira de 
Souza et al overcome this problem by using wirewound 
resistors, which have a small probability of short circuit 
failures. However, wirewound resistors are expensive and 
difficult to integrate, so an alternative solution is 
required. The fault table in Fig. B.lb also shows that 
certain faults will be indicated when no data path is 
selected.
If the resistors in Fig. B.la are replaced by diodes, then 
the circuit and fault table become those in Fig. B.2. 
Some faults now create an indeterminate condition. This 
occurs when the anode of a diode is stuck-at 1 (SA1) and 
a 0 is applied to its cathode. The actual voltage levels 
at either end of the diode will depend on the forward volt 
drop across it, V^ ., as well as the source resistances of 
the SA1 fault and driving 0. For normal use, V^ . should be 
as low as possible, so that a transmitted 0 is still a 0 
at each receiver ('O' + V^ .). Germanium 0A47 diodes have 
been used for practical purposes CB.33, but modern schott- 
ky signal diodes are also suitable and readily integrated 
CB.43.
The fault table in Fig. B.2b also shows that a short
circuit diode is, again, not detectable. This situation
can be resolved, however, by replacing the diodes with 
✓
-342-
switched diodes, i.e. transistors. Each transistor is 
turned on when the data flow is through its associated
buffer. The circuit and fault table for this configura­
tion are given in Fig. B.3. The transistor and its base
resistor are selected on the basis of voltage and current 
levels, propagation delay and rise times. In the proto­
type, these were a Texas 2N2926A transistor CB.53 and a Ik 
ohm resistor. From Fig. B.3b, there are a number of 
indeterminate faults. A short circuit base resistor gives 
the condition mentioned above. In this instance, it is 
across the base-emitter junction of the transistor. This 
condition, however, will only occur when that transistor
is active. A receiver input SA1 also results in a 
similar condition, this time between the collector and the 
emitter of the transistor, but again, only when that 
transistor is active.
The other isolation circuit used by Moreira de Souza et 
al is a series diode for single transmitter to multiple 
receiver structures. This is shown again in Fig. B.4a. 
Fig. B.4b gives the fault table -for this structure, which, 
once again, has the indeterminate condition described 
above. All other faults, though, are definitely detect­
able. Replacing the diodes with transistors is therefore 
unnecessary. If they were, a base drive buffer output 
SA1 fault, i.e. the transistor permanently turned on (see 
Fig. B.3a>, is not detectable in any case.
B. 3 : BIDIRECTIONAL ISOLATORS
The experimental self checking computer, described in 
Chapter 9, has a bidirectional data bus. Moreira de Souza 
et al do not consider isolators for such a bus CB.13. 
Their memory and peripherals have separate data in and 
data out lines, so the data bus consists of two unidirec­
tional buses. The data bus of the experimental computer 
could be split into two undirectional buses for isolation 
purposes, but this would then not represent a conventional 
system.
-343-
A master-slave transceiver bus structure is formed by the 
merging of the structures in Figs. B.3a and B.4a. The 
isolation circuits used in the structures of Figs. B.3a 
and B.4a. are also merged to form an isolating circuit for 
the master-slave structure. Then the overall circuit and 
fault table become those in Fig. B.5. Another type of 
indeterminate condition is now introduced. This occurs 
when the cathode of a diode is stuck-at-0 (SAO) and a 1 is 
applied to its anode.
In Fig. B.6 a full fault analysis is performed on the 
fault table of Fig. B.5b. Indeterminate conditions have 
been resolved by practical experiment; i.e. checker 
indications represent the most likely effect of faults 
which create inderminate conditions. A short circuit base 
resistor is the only fault, in general, which is undetec­
table. However, the buffer which supplies the transistor 
base current may be doing so for more than one isolating 
circuit. Depending on the drive capability of this buffer 
and the value of the base resistors, then a short circuit 
base resistor may sufficiently lower the output of the 
buffer, so that some or all of the other transistors turn 
off, in which case the fault will be detected. The 
addition of a base-emitter resistor, as used in the next 
section, is resisted, since it would make the isolating 
circuits more complex.
For three fault indications identified in Fig. B.6, the 
fault cannot be diagnosed to one of the three buffers. 
Note that isolators are considered to be part of their 
associated buffer. However, the fault indications for no 
system activity, see Fig. B.6, resolve this problem.
B .4 : UNSWITCHED UNIDIRECTIONAL ISOLATORS
All the unidirectional and bidirectional structures con­
sidered so far have assumed the data path to be switched. 
This assumption is not satisfactory for address and
-344-
control buses, in particular, where all buffers are 
permanently enabled. The single transmitter to multiple 
receiver structure is the most appropriate for these 
unidirectional buses. An alternative isolator to the 
switched transistor is therefore required.
The first circuit evaluated for this purpose, was that 
shown in Fig. B.7. It is similar to the input circuitry 
of a standard TTL logic gate CB.63. However, it is 
unusable, because two input faults are undetectable. 
These are base-emitter and collector-emitter shorts in 
transistor Tl. Various modifications were made to the 
circuit to detect these faults. However, the modifica­
tions then created a significant number of undetectable 
faults in other parts of the circuit.
A conventional single stage transistor amplifier was then 
considered, as shown in Fig. B.8a. Fig. B.8b gives the 
fault table for this circuit. Note that a checker is not 
required at the input of any receiver. Since the buffers 
are permanently enabled, any fault indication at a 
receiver input will also be indicated at its output.
The fault table in Fig. B.8b reveals that a short circuit 
input resistor is not detected. This assumes that neither 
the transmitter or the isolating transistor is destroyed. 
The ideal solution is for a short circuit input resistor 
to sufficiently drag down the output level of the trans­
mitter, such as to turn off the transistor in all other 
isolating circuits connected to this node. This does not 
happen in the circuit of Fig. B.8a, because the input 
level, under these conditions, is still higher than the 
level at which the transistor turns off, referred to as 
the trigger level. A transistor will turn off under these 
conditions, though, if the resistance of its input 
resistor is increased to greater than 10k ohms. However, 
the propagation delay of the isolator is then unacceptably 
increased. An alternative solution is to add a base- 
emitter resistor to the circuit of Fig. B.8a. This will
-345-
divert current away from the base of the transistor, 
thereby increasing the trigger input level. A value of 
820 ohms was selected in the prototype for this purpose, 
on the basis of trigger level and propagation delay.
Fig. B.9 shows the modified structure and its fault table. 
Unfortunately, open circuit base-emitter and collector 
resistors are both, in general, undetectable. An open 
circuit collector resistor is equivalent to an open 
circuit tie-up resistor. During its high impedance state, 
a bus line with an open circuit tie-up resistor will 
float. In this state it is possible for the line to look 
like a 0. If so, the fault will be detected.
If a base-emitter resistor is open circuit, then undetec­
table second faults, excluding collector resistor faults, 
are a short circuit base resistor and an open circuit 
base-emitter resistor in the other isolating circuit. The 
probability of either of these faults occurring as second 
faults, is small. The probability of detecting a second 
fault, as well as the probability of the first fault not 
being an open circuit base-emitter resistor, are both 
high, typically greater than 90%.
In Fig. B.10 a full fault analysis is performed on the 
fault table of B.8b. All indicated faults are diagnosable 
to their source, one of the three buffers.
-346-
B .5 : REFERENCES
B.l) A research oriented microcomputer with built in
auto-diagnostics - J. Moreira de Souza, E. Peixoto 
Paz, C. Landrault; FTCS-6*; pp. 3-8.
B.2) High-speed CMOS replacements for LS TTL promise
continued viability of jellybean logic - R. H. 
Cushman; EDN; March 17, 1983; pp. 64-74.
B.3) Mullard Technical Handbook, Book One, Semiconductor
Devices, Part 3, Diodes; Mullard Ltd.; London, 
England; September 1977.
B.4) 1A, 3A Schottky Rectifier Diodes; 11DQ, 31DQ series;
Bulletin E2126; International Rectifier; Surrey, 
England.
B.5) Texas Instruments Semiconductor Components Data Book
Four, Transistors; Texas Instruments Ltd; Bedford, 
England; 1971.
B.6) The TTL book for design engineers - The engineer­
ing staff of Texas Instruments components group; 
Texas Instruments; Fourth European edition; 1980.
* see below.
FTCS-1 : The 1st Annual International Symposium on Fault-
Tolerant Computing; Pasadena,California,USA; March 1971.
FTCS-2 : The 2nd Annual International Symposium on Fault-
Tolerant Computing; Boston, Massachusetts, USA; June
1972; IEEE Pub. No. 72 CH0623-4C.
FTCS-3 : The 3rd Annual International Symposium on Fault-
Tolerant Computing; Palo Alto, California, USA; June
1973; IEEE Pub. No. 73 CH0772-4C.
FTCS-4 : The 4th Annual International Conference on
Fault-Tolerant Computing; Urbana, Illinois, USA; June
1974; IEEE Pub. No. 74 CH0864-9C.
FTCS-5 : The 5th Annual International Symposium on Fault-
Tolerant Computing; Paris France; June 1975; IEEE Pub. 
No. 75 CH0974-6C.
FTCS-6 : The 6th Annual International Symposium on Fault-
Tolerant Computing; Pittsburgh, Pennsylvania, USA; June 
1976; I.EEE Pub. No. 76 CH1094-2C.
FTCS-7 : The 7th Annual International Conference on
Fault-Tolerant Computing; Los Angeles, California, USA; 
June 1977; IEEE Pub. No. 77 CH1223-7C.
FTCS-8 : The 8th Annual International Conference on
Fault-Tolerant Computing; Toulouse, France; June 1978;
IEEE Pub. No. 78 CH1286-4C.
-347-
FTCS-9 : The 9th Annual International Symposium on Fault-
Tolerant Computing; Madison, Wisconsin, USA; June 1979; 
IEEE Pub. No. 79 CH1396-1C.
FTCS-10 : The 10th Annual International Symposium on
Fault-Tolerant Computing; Kyoto, Japan; October 1980; 
IEEE Pub. No. 80 CH1604-8.
FTCS-11 : The 11th Annual International Symposium on
Fault-Tolerant Computing; Portland, Maine, USA; June 
1981; IEEE Pub. No. 81 CH1600-6.
FTCS-12 : The 12th Annual International Symposium on
Fault-Tolerant Computing; Santa Monica, California, USA; 
June 1982; IEEE Pub. No. 82 CH1760-8.
FTCS-13 : The 13th Annual International Symposium on
Fault-Tolerant Computing; Milan Italy; June 1983; IEEE 
Pub. No. 83 CH1894-5.
FTCS-14 : The 14th Annual International Conference on
Fault-Tolerant Computing; Kissimmee, Florida, USA; June 
1984; IEEE Pub. No. 84 CH2050-3.
FTCS-15 : The 15th Annual International Symposium on
Fault-Tolerant Computing; Ann Arbor, Michigan, USA; June 
1985; IEEE Pub. No. 85 CH2143-6
1979 Test Conf, : 1979 IEEE Test Conference; Cherry Hill, 
New Jersey, USA; October 1979; IEEE Pub. No. 
79 CH1509-9C.
1980 Test Conf. : 1980 IEEE Test Conference; Philadelphia, 
Pasadena, USA; 11-13 November 1980; IEEE New York 1980.
Autotestcon '80 : IEEE Autotestcon '80; Washington, DC, 
USA; 2-5 November 1980; IEEE New York 1980.
Infotech : Infotech State-of-the-Art Report on Computer









B1 TX TO B2 B1 TX TO B3
Cl C2 C3 C4 C5 Cl C2 C3 C4 C5
B1 0/P SAO X* X* XA X XA XA XA X
B1 0/P SA1 X X X X X X X X
B2 I/P SAO XA X XA
B2 I/P SA1 X X X
B3 I/P SAO XA XA k
B3 I/P SA1 X X X
R1 0/C X X X
R1 S/C
R2 0/C X X X
R2 S/C
RT 0/C
RT S/C X X X X X X X X
(b) FAULT TABLE
Notes 1) B2/B3 enabled when transmission is via that buffer.
2) X - fault indicated.
3) X* • fault indicated with -no buffer enabled.
4) r. Iln fv.
 A V  ► j^>-VRX follows VTX since Iln o.
t
VTX VRX
5) pV. ... SAO Can still drive line regardless
" SA1 of fault.
acts as tie-down/up resistor 
but i 
Vcc.
■vW   SAO
t
potential divider action
(assumed still at *1* in table above)
6) Tie-up resistors not required at C2/C3. In fact they 
would be detrimental to circuit operation.
7) R1 S/C. R2 S/C and possibly RT 0/C not detected.
FIGURE B .1 RESISTORS AS UNIDIRECTIONAL ISOLATORS
-349-
D A





FAULT B1 TX TO B2 B1 TX TO B3
Cl C2 C3 C4 C5 Cl C2 C3 C4 C5
B1 0/P SAO X* X* X* X XA X* X A X
B1 0/P SA1 X X X X X X X X
B2 I/P SAO X* X XA
B2 I/P SA1 # # # # # # * #
B3 I/P SAO X* X A X
B3 I/P SA1 # # » * # # 1 #
D1 0/C X X X
D1 S/C












B2/B3 enabled when transmission is via that buffer.
■ fault indicated.
= fault indicated with no buffer enabled.
• indeterminate fault : '0'-----1<]--SA1
and R2 keep B2 and B3 inputs at 1 when B1 outputs a 
S/C a B1 0/P SA1 Any of these resistors open
S/C = B2 I/P SA1 ? circuit results in a floating
S/C a B3 I/P SA1 ) input and a possible fault.
S/C and D2 S/C are not detected.





















Tl/2 * 2N2369A 
R2/4 » lk







B1 TX TO B2 B1 TX TO B3
Cl C2 C3 C4 C5 Cl C2 C3 C4 C5
B1 0/P SAO X* X X X A X X
B1 0/P SA1 X X X X X X
B2 I/P SAO X* X*
B2 I/P SA1 # # #
B3 I/P SAO X A XA X
B3 I/P SA1 #- * *
B4 0/P SAO X X
B4 0/P SA1 X
B5 0/P SAO X X
B5 0/P SA1 X
T1BE 0/C X X
T1BE S/C X X
T1BC 0/C X X
T1BC S/C X
TICE O/C X X
TICE S/C X
T2BE 0/C X X
T2BE S/C X X
T2BC 0/C X X
T2BC S/C X
T2CE 0/C X X
T2CE S/C X
R2 0/C X x
R2 S/C # « *
R4 0/C X X
R4 S/C # « «
<b) FAULT TABLE
Notes : 1) B2/B3 enabled when transmission is via that buffer.
2) X - fault indicated.
3) X* - fault indicated with no buffer enabled.
4) # • indeterminate fault : '0' —  SA1
5) Transistor BC S/C provides a diode (BE) as required, 
except that it is permanently on.
6) R1 S/C ■ B1 0/P SA1 ) Any of these resistors open
R3 S/C ■ B2 I/P SA1 r circuit results in a floating
R5 S/C s B3 I/P SA1 ' input and a possible fault.






C2 parity checker 
Dl/2 * 0A47
(a) CIRCUIT
FAULT B1 TX TO B3 B2 TX TO B3
Cl C2 C3 C4 Cl C2 C3 C4
B1 0/P SAO X* X A X ** » X
B1 0/P SA1 X X X
B2 0/P SAO X* X A X XA XA X
B2 0/P SA1 X X X
B3 I/P SAO X A X X A X
B3 I/P SA1 # « * # « #
D1 0/C X X
01 S/C X
D2 0/C X X
D2 S/C X
(b) FAULT TABLE
Notes i 1) B1/B2 enabled when transmission is via that buffer.
2) X - fault indicated.
3) X* ■ fault indicated with no buffer enabled.
4) # - indeterminate fault i '0'-- 1<]--- SA1.
5) R keeps B3 input at 1 when B1 or B2 output a 1.
6) R S/C s B3 I/P SA1
R 0/C results in B3 input floating and a possible fault.
FIGURE B.4 DIODES AS UNIDIRECTIONAL ISOLATORS - 2
-352-
'parity checker 
Tl/2 = 2N2369A 











B1 TX TO B2
Cl C2 C3 C4 C5 C6
+ A
+ A +  A
X
B1 TX TO B3
Cl C2 C3 C4 C5 C6
XA
+  A + A
B2 TX TO B1










































B2/B3 enabled when transmission is via that buffer. 
X = fault indicated.
a  = fault indicated with no buffer enabled.
# » indeterminate fault : '0' «1----SA1
+ * indeterminate fault : SAO J<j----' 1'
Transistor BC S/C provides a diode (BE) as required 
except that it is permanently on.
R1 S/C s Bib SA1 ) Any of these resistors open
R2 S/C 3 B2a SA1 \ circuit results in a floating
R3 S/C = B3a SA1 ) input and a possible fault.
FIGURE B.5 A BIDIRECTIONAL ISOLATING CIRCUIT
-353-
NORMAL OPERATION 1
CHECKER INDICATIONS SINGLE FAULTS DETECTED
Cl C2 C3 C4 C5 C6
X X X Bib SAO/1 lb),R\ S/C (^»),B2a SAO lb)
X X X Bib SAO/1 <^),R1 S/C (^>),B3a SAO lb)
X X Bib SAO /1 (4^ 4-2), Dl O/C ( )  , D2 O/C (4^),R1 S/C < ^ M >
X X X B2a SAO <^M),B2a SA1 (4^),R2 S/C (<^)
X X X B3a SAO (<^«^),B3a SA1 (^),R3 S/C
X X T1BE S/C or O/C (^)),T1CE O/C (^),T1BC O/C (^  > , BD1 SAO (^),RB1 O/C (^)),B2a SA1 ( b  ), R2 S/C (^)
X TICE S/C { A , B D 1  SA1 ( A ),D1 S/C < ) ,T1BC S/C ( < b  )
X X T2BE S/C or O/C <^),T2CE O/C (=U),T2BC O/C (^4),BD2 SAO (^),RB2 O/C (^),B3a SA1 <^>,R3 S/C lb)
X T2CE S/C ( <rb , BD2 SA1 (A ) , D 2  S/C ( A  ) ,T2BC S/C ( A )
X X B2a SAO ( ^ )
X X B3a SAO ( ^)>
NO BUFFERS ENABLED
X Bib SAO
X X B2a SAO
X X B3a SAO
NOTES : 1) X * fault Indicated.
2) SAO * stuck-at-0 ; SA1 ■ stuck-at-l.
3) O/C - open circuit ; S/C * short circuit.
4) £  » read from B2 ; b  . write to B2.
5) <2 * read from B3 ; b  « write to B3.
6) For fault diagnostic purposes:
a) R1 associated with B1.
b) Dl, Tl, RBI, R2 and BD1 associated with B2.
c) D2, T2, RB2, R3 and BD2 associated with B3.
7) RBI or RB2 short circuit are both undetectable.
8) * . fault not diagnosable to Bl, B2 or B3 from checker indication.






T1BE S/C, TICE S/C and 

































Tl/2 * 2N2369A 




FAULT Cl C2 C3 FAULT Cl C2 C3
B1 0/P SAO X X X T2BE S/C X
B1 O/P SA1 X X X T2BC O/C X
B2 I/P SAO X T2BC S/C X
B2 I/P SA1 X tScE O/C X
B3 I/P SAO X T2CE S/C X
B3 I/P SA1 X RBI O/C X
TlBE O/C X RBI S/C
TlBE S/C X RC1 O/C
TlBC O/C X RC1 S/C X
TlBC S/C X RB2 O/C
TICE O/C X RB2 S/C
TICE S/C X RC2 O/C X
T2BE O/C X RC2 S/C
RU1 S/C, RB2 S/C, RC1 O/C and RC2 O/C 
not detected
<b) FAULT TABLE

















T27 parity checker 
Tl/2 = 2N2369A 






FAULT Cl C2 C3 FAULT Cl C2 C3
Bib SAO X X k T2BC S/C X
Bib SA1 X X X T2CE O/C x
B2a SAO X T2CE S/C x
B2a SA1 X RBI O/C X
B3a SAO X RBI S/C X X
B3a“SAT- X RBE1 O/C
TlBE O/C k RBE1 S/C X
T1BE S/C X RC1 O/C
TlBC O/C X RCl S/C k
TlBC S/C X RB2 6/C X
TICE O/C k RB2 S/C k k
TICE S/C k RBE2 O/C
T2BE O/C X RBE2 S/C X
T2BE__S_/_£_ X RC2 O/C
T2BC O/C X RC2 S/C k
RBE1 O/C, RBE2 O/C, RC1 O/C and RC2 O/C 
not detected
(b) FAULT TABLE
FIGURE B.9 UNSWITCHED UNIDIRECTIONAL
ISOLATOR - 3
CHECKER INDICATIONS SINGLE FAULTS DETECTED
Cl C2 C3
X X X Bib SAO/1
X B2a SAO/I, ALL Tl FAILURES, RBI O/C, RBE1 S/C, RCl S/C
X B3a SAO/I, ALL T2 FAILURES, RB2 O/C, RBE2 S/C, RC2 S/C
X X RBI S/C






X = fault indicated.
SAO * stuck-at-0 ; SA1 “ stuck-at-1.
O/C - open circuit ; S/C ■ short circuit.
For fault diagnostic purposes:
a) Tl, RBI, RBE1 and RC1 are associated with B2.
b) T2, RB2, RBE2 and RC2 are associated with B3.
RBE1, RBE2, RC1, or RC2 open circuit are all undetectable
FIGURE B.IO FAULT ANALYSIS FOR THE UNSWITCHED
UNIDIRECTIONAL ISOLATOR - 3
-356-
