On-line reconfiguration of systolic arrays by Singh, Karunesh Pratap



ON-LINE RECONFIGURATION 
OF SYSTOLIC ARRAYS 
By 
© Karunesh Pratap Singh, B. Tech.
A thesis submitted to the School of Graduate Studies 
in partial fulfillment of the requirements for the degree of 
Master of Engineering 
St. John's 
Faculty of Engineering and Applied Science 
Memorial University of Newfoundland 
June, 1992 
Newfoundland Canada 
The author has granted an 
irrevocable non-exclusive licence 
allowing the National Library of 
Canada to reproduce, loan, 
distribute or sell copies of 
his/her thesis by any means and 
in any form or format, making 
this thesis available to interested 
persons. 
The author retains ownership of 
the copyright in his/her thesis. 
Neither the thesis nor substantial 
extracts from it may be printed or 
otherwise reproduced without 
his/her permission. 
ISBN 0-315-78121-1 
ABSTRACT 
Various existing reconfiguration algorithms for array processors cannot be used
efficiently for on-line reconfiguration of the array because they require a central
processor to initiate and control the reconfiguration. In addition, most of the
existing algorithms assume that the switching network is operationally fault-free.
This report presents an on-line reconfiguration scheme for array processors.
The proposed algorithm can tolerate both processing element failure and switching
network failure. The processing elements and switches are of a self-testing type and
link failures are detected by the processing elements (by using parity bit checks).
The array is provided with a bottom row of spare cells and when a processing
element detects either a self fault or a link failure, it invokes the reconfiguration. A
downward global shift (for the particular column) is performed to accomplish the
reconfiguration. A number of reconfiguration requests are generated by the
processing elements and switch modules to facilitate the reconfiguration. The network
is modified and links for propagation of reconfiguration requests are added. This
scheme makes full use of non-faulty partial results and it blocks the faulty partial
results.
The reconfiguration in the case of a processing element failure is completed in 
two stages while the reconfiguration in the case of a link failure is completed in a 
single stage. The links are duplicated to achieve redundancy and in the case of a 
link failure the spare link is used. 
ACKNOWLEDGEMENTS 
I am grateful to Dr. R. Venkatesan for his guidance and constant encouragement
throughout the duration of my program.
I wish to thank the School of Graduate Studies and Faculty of Engineering,
Memorial University for the financial support during my program.
I am indebted to Dr. G.H. George for his invaluable help in making this thesis 
more readable and presentable. 
Finally, I would like to thank my wife, Ranjana, for her support and help during
my stay in St. John's.
Contents
ABSTRACT
ACKNOWLEDGEMENTS
List of Figures
List of Tables
List of Symbols
1 INTRODUCTION
1.1 Thesis Organization
2 LITERATURE REVIEW
2.1 Concept of Systolic Arrays
2.1.1 Broadcast inputs, move results and weights stay
2.1.2 Results stay, inputs and weights move in opposite directions
2.1.3 Weights stay, results and inputs move in opposite directions
2.2 Fault Detection Schemes
2.2.1 Matrix Encoding Methods
2.2.2 Recomputing with Shifted Operands (RESO)
2.2.3 Triple Data Redundancy
2.2.4 Comparison with Concurrent Redundant Computation (CCRC)
2.2.5 Double Calculation in the Same PE
2.3 Reconfiguration Schemes
2.3.1 RC Cut (Row Column Cut) Method
2.3.2 RCS (Row, Column Slanted) Cut Method
2.3.3 Kuo-Fuchs Method
2.3.4 Diogenes Method
2.3.5 Fault Stealing Methods
2.3.6 CFS (Complex Fault Stealing) Method
2.3.7 FUSS (Full Use of Suitable Spares) Method
2.3.8 Local Redundancy Methods
3 ON LINE RECONFIGURATION
3.1 On-Line Reconfiguration Scheme
3.2 Implementation
3.2.1 Loading of Weights
3.2.2 Handling of Partial Results
3.2.3 Switch Module
3.2.4 Network
3.2.5 Processing Element
3.2.6 Switch
3.3 Operation of the Algorithm
3.4 Concluding Remarks
4 ALGORITHM FOR PE AND LINK FAILURE TOLERANCE
4.1 Data Routing
4.1.1 Vertical Data Routing Path (for PE and Link failures)
4.1.2 Horizontal Data Routing Path (for PE and Link failures)
4.2 Handling of a Link failure
4.3 Combined PE and Link Failure
4.3.1 PE Failure (in presence of faulty Links)
4.3.2 Link Failure (in presence of faulty PEs)
4.4 Implementation
4.4.1 Network
4.4.2 Processing Element
4.4.3 Switch Module
4.5 Operation of the Algorithm
4.6 Concluding Remarks
5 Algorithm Evaluation
5.1 Analytical Results
5.1.1 Probability of Survival After a PE Failure
5.1.2 Probability of Survival After a Link Failure
5.2 Analysis of Simulation Results
5.2.1 Simulation Software Outline
5.2.2 Confidence Level of the Simulation
5.2.3 Probability of Failure
5.2.4 Simulation Results
5.3 Concluding Remarks
6 CONCLUSIONS
7 SUGGESTIONS FOR FURTHER RESEARCH
A Program Structure
B Probability of Survival
List of Figures
2.1 Finite State Machine
2.2 Reduced Memory Interaction
2.3 Systolic Array Representation
2.4 Design 1; Broadcast inputs, Results move and Weights stay
2.5 Design 2; Results stay, Inputs and Weights move
2.6 Design 3; Weights stay, Results and Inputs move
2.7 Matrix Encoding Method
2.8 Recomputation with Shifted Operands
2.9 Multiplier with Ripple Carry Adder
2.10 Triple Time Redundancy
2.11 Comparison with Concurrent Redundant Computation (CCRC)
2.12 CCRC: Comparison Schemes
2.13 Recomputation in the same PE
2.14 Data Flow Diagram (for recomputation in the same PE)
2.15 Direct Replacement and Global Deformation
2.16 Row Column Cut Method
2.17 Row Column Slanted Cut Method
2.18 Kuo-Fuchs Method
2.19 Diogenes Method
2.20 Simplest Fault Stealing Method
2.21 Modified Fault Stealing Method
2.22 Complex Fault Stealing Method
2.23 FUSS Scheme
2.24 Interstitial Redundancy Scheme
3.1 Proposed On-Line Reconfiguration Scheme
3.2 Staging Latch Position in Normal Arrays
3.3 Pipeline, Before and After reconfiguration
3.4 Modified Staging Latch Position
3.5 Block Diagram of PE (with emphasis on Coefficient Loading Circuit)
3.6 Output Latch Block for Random Coefficient Loading
3.7 Block Diagram of PE With Two Static Coefficient Latches
3.8 Basic Array with Switch Modules
3.9 Vertical and Horizontal Data Paths
3.10 Vertical Partial Result Handling
3.11 Horizontal Partial Result Handling
3.12 Vertical Data Routing Path
3.13 Network for Vertical Data Handling during Reconfiguration
3.14 States of Vertical Switches (For PE failure algorithm)
3.15 Horizontal Data Routing Path
3.16 Network for Horizontal Data Handling during Reconfiguration
3.17 States of Horizontal Switches (For PE failure algorithm)
3.18 Horizontal reconfiguration for $PE_{i,j}$ in presence of faulty $PE_{i1,j-1}$
3.19 Horizontal reconfiguration for $PE_{i,j}$ in presence of faulty $PE_{i1,j+1}$
3.20 Modified Network (for supporting PE failure algorithm)
3.21 Control Lines for PE and Switches (PE failure algorithm)
3.22 Complete Block Diagram of Modified PE (PE failure algorithm)
3.23 Control Circuit of PE (PE failure algorithm)
3.24 Signal Waveforms (Output of the PE13 control circuit)
3.25 Reconfiguration Request Propagation (for PE failure algorithm)
3.26 Block Diagram of the Switch module (for PE failure algorithm)
3.27 State Transition of The Switches (for PE failure algorithm)
4.1 Vertical Data Path (for PE and Link failures)
4.2 States of the Vertical Switches (For combined PE and Link failures)
4.3 Horizontal data Path (Combined PE and Link failure)
4.4 Horizontal Switch States (Combined PE and Link failure)
4.5 Switch State Changes (Combined PE and Link failure)
4.6 Vertical switch State Changes (Combined PE and Link failure)
4.7 Link failure Reconfigurations
4.8 Various Clock Signals
4.9 Network for Combined PE and Link Failure
4.10 Processing Element Lines for Combined PE and Link Failure
4.11 Block Diagram of the Processing Element (combined PE and link failure algorithm)
4.12 Schematic of the RR generating Circuit (combined PE and link failure algorithm)
4.13 Timing Diagram of the RR generating Circuit
4.14 Data and Control Lines for a Switch (combined PE and link failure algorithm)
4.15 Block Diagram of the Vertical Switch (combined PE and link failure algorithm)
4.16 Block Diagram of the Horizontal Switch (combined PE and link failure algorithm)
4.17 Operation of the Algorithm (combined PE and link failure algorithm)
5.1 Normal Density Function
List of Tables
4.1 State changes for Intermediate Stage
4.2 State changes due to A1 and B1
4.3 State changes for Final Stage
4.4 State changes due to clock edge
4.5 State changes using set-reset inputs
4.6 Generation of A0
4.7 J-K flip flop inputs at t2
4.8 J-K flip flop inputs due to A1 and B1
4.9 J-K flip flop inputs at t + 2
5.1 Estimated Values of Probabilities of Survival (Array Size = 4 x 4)
5.2 Estimated Values of Probabilities of Survival (Array Size = 1 x 4)
List of Symbols
PE    Processing Element
I/P    Input
O/P    Output
I/O    Input/Output
$I^H_i$    Horizontal input of row i to the array
$I^V_j$    Vertical input of column j to the array
$O^H_i$    Horizontal output of row i from the array
$O^V_j$    Vertical output of column j from the array
Static Coefficients
$PE_{i,j}$    PE with physical index (i, j)
PE with logical index (i, j)
$I^t_x$    Input of $PE_x$ at time t (when only one input is there)
$O^t_x$    Output of $PE_x$ at time t (when only one output is there)
$I^{H,t}_x$    Horizontal input of $PE_x$ at time t
$I^{V,t}_x$    Vertical input of $PE_x$ at time t
$O^{H,t}_x$    Horizontal output of $PE_x$ at time t
$O^{V,t}_x$    Vertical output of $PE_x$ at time t
$I^H_{PE0}, I^H_{PE1}, I^H_{PE2}$    Horizontal input ports of PE
$I^V_{PE0}, I^V_{PE1}$    Vertical input ports of PE
$O^H_{PE0}, O^H_{PE1}, O^H_{PE2}$    Horizontal output ports of PE
$O^V_{PE0}, O^V_{PE1}$    Vertical output ports of PE
$W_{i,j}$    Static coefficient (weight), corresponding to $PE_{i,j}$
$S_{i,j}$    Switch module with index (i, j)
$S^H_{i,j}$    Horizontal switch module with index (i, j)
$S^V_{i,j}$    Vertical switch module with index (i, j)
$I^H_{S0}, I^H_{S1}, I^H_{S2}, I^H_{S3}$    Horizontal input data ports of switches
$I^V_{S0}, I^V_{S1}, I^V_{S2}$    Vertical input data ports of switches
$O^H_{S0}, O^H_{S1}, O^H_{S2}, O^H_{S3}$    Horizontal output data ports of switches
$O^V_{S0}, O^V_{S1}, O^V_{S2}$    Vertical output data ports of switches
$L^y_x$    Link, connecting the output of x to y
$RR^y_x$    Reconfiguration request, generated by x and fed to y
$CLK_{PE}$    Global clock to the PEs
$CLK_S$    Clock pulses to the switches
State x of the horizontal switches
State x of the vertical switches
FP    Fatal Failure
Error in logic circuit
SPE    Signal to the spare cells
Link failure signal, failure detected by x and reported to y
$Er_H$    Error in horizontal input
$Er_V$    Error in vertical input
Chapter 1 
INTRODUCTION 
Von Neumann architecture restricts the speed of a memory based hardware system 
because of the limited number of interconnections. The system speed can be in-
creased by reducing the number of memory interactions. Systolic arrays accomplish 
this and thereby improve the system performance. Here, the interaction with the 
outside world occurs only at the boundary cells of the array and once the data are 
fed to the array, the intermediate results are not passed on to the memory devices. 
A systolic array is an array of similar processing elements, where every element 
performs the same basic operation.
Systolic arrays can be classified under various categories depending on the data 
flow inside the array. The most common type is that of moving result, static 
weights. In this type of arrays, the partial results move in a pre-specified way and 
the weights stay in the processing elements. The various types are described in 
Chapter 2. 
These arrays have a number of similar processing elements, so some spare cells 
can conveniently be introduced. In the case of a cell failure, the spare cell can 
replace the faulty one, thus improving the system reliability. 
Various reconfiguration schemes (described in Chapter 2) have been proposed 
for the reconfiguration of these arrays in the event of a fault occurrence. Most of
the proposed reconfiguration schemes use hardware redundancy (spare cells) and in
case of a fault detection, the reconfiguration algorithm is performed on the array by
an external central processor (which maintains information about the operational
effectiveness of processing elements). The external processor is capable of changing
the data routing. The reconfiguration algorithm changes the data routing paths
and makes the array operational if the algorithm is successful. The reconfigured
array is flushed to clear the partial results and the array can then be used again.
Since the array is flushed after every fault occurrence, these reconfiguration
algorithms cannot be used effectively during run-time. In addition, these algorithms
assume a fault free routing network, which is difficult to achieve. These two major
shortcomings restrict the use of the above algorithms to production time yield
improvement.
An on-line reconfiguration scheme should preferably be capable of utilizing those 
partial results which were not affected by the fault occurrence (referred to as
non-faulty partial results in this report). In addition, the faulty partial results should
be blocked by the algorithm to ensure the proper operation of the array. If a faulty
partial result is passed on to the next processing element, it would make the final 
results erroneous. 
An on-line reconfiguration algorithm is presented in this report which
accomplishes the above-mentioned tasks and, in addition, tolerates switching network
failures. This algorithm requires an additional row of processing elements, called
spare cells (and this row is the bottom most row of the array). When a processing
element failure is detected, the spare cell of the corresponding column is used to
replace the faulty cell. Similarly, redundant links are provided to ensure the tolerance
of link failures.
The processing element and switch modules are redesigned to accomplish the
generation of above system. Each processing element performs a self-test and
invokes reconfiguration (by generating reconfiguration requests) when it detects a self
fault.
It is assumed that a central processor is linked to the array, which is capable 
of controlling the clock pulses to the array. In the event of a detected failure, the 
central processor is informed about the failure and it delays the clock pulses (as 
will be explained in chapters 3 and 4).
The processing element which detects the fault informs the neighbouring pro-
cessing elements and switches about the fault occurrence. These neighbouring ele-
ments and switches generate reconfiguration requests again (if required) and inform 
the other elements and switches. The reconfiguration request keeps on propagating 
in this manner until it reaches the central processor. 
1.1 Thesis Organization 
The thesis is divided into six chapters. This chapter gives an introduction to the 
research topic. Chapter 2 gives an overview of systolic designs and explains various 
existing reconfiguration schemes and fault detection schemes. In this chapter it is 
shown that most of the existing reconfiguration schemes cannot be used effectively 
during run-time. An on-line reconfiguration algorithm for processing element
failures is proposed in Chapter 3. In addition, Chapter 3 describes the changes (in
the design of processing elements, network and switch modules) required for the
implementation of this algorithm. In this chapter it is proved that the
recommended changes are sufficient to facilitate the reconfiguration. Chapter 4 explains
the reconfiguration algorithm for failures in processing elements and links. Various 
control circuits (for PEs and switches) are designed in this chapter. The proposed 
algorithm is evaluated in Chapter 5 and conclusions are presented in Chapter 6. 
Chapter 2 
LITERATURE REVIEW 
In this chapter, Section 2.1 explains the basic concept of systolic arrays. Section 2.2 
describes various fault detection schemes for these arrays and Section 2.3 gives a
summary of various well-known reconfiguration schemes and compares them. 
2.1 Concept of Systolic Arrays 
When memory-based hardware is used, the typical Von Neumann bottleneck comes
into the picture because of the limited number of interconnections which can be
supported by conventional electronics based technology. In memory-based systems,
the memory-access time restricts the speed of the system. This can be further
explained with the help of the classical finite state machine, shown in Figure 2.1.a.
This machine consists of several storage elements, M, a logic unit, inputs (I/P),
outputs (O/P) and interconnections. In this scheme all the memory elements are
updated and/or read simultaneously in parallel without any addressing. This
configuration is not feasible if the number of memory elements is large. So addressing
is used to reduce the requirement of parallel lines. This scheme is shown in
Figure 2.1.b. Here additional address lines are used and data is fetched in parallel to
all the memory elements by a bus and, similarly, a bus carries the output from the
memory to the logic circuit.
Here, the number of required lines is reduced but the system performance has
degraded because now only one memory element can be addressed at a time; in
addition, an address is required to access the memory locations. This results in an
increased memory-access time [1].
Figure 2.1: Finite State Machine
Figure 2.2: Reduced Memory Interaction
The Von Neumann problem can be solved by reducing the number of memory 
interactions. To explain this, we will consider a processing element which requires 
at least two memory interactions per operation. If memory access time is 100 ns, 
we would get a maximum of 5 million operations per second (assuming that the 
processing element takes negligible time compared to the memory access time). 
But, if the data are returned to the memory after n such operations, the speed 
becomes 5n million operations per second (see Figure 2.2).
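The arithmetic above can be checked with a short sketch. The function name and its parameters are illustrative assumptions, not part of the thesis; the model simply assumes that memory access time dominates and that processing time is negligible, as stated in the text.

```python
def ops_per_second(access_time_ns, accesses_per_op=2, ops_per_round_trip=1):
    """Peak operation rate when memory access time dominates.

    access_time_ns     -- time for one memory access, in nanoseconds
    accesses_per_op    -- memory interactions needed per round trip (2 in the text)
    ops_per_round_trip -- operations performed before data return to memory (n in the text)
    """
    # Time spent in memory for one round trip, in nanoseconds.
    round_trip_ns = accesses_per_op * access_time_ns
    # Operations completed per second (1 s = 1e9 ns).
    return ops_per_round_trip * 1_000_000_000 / round_trip_ns


# One operation per round trip, 100 ns access time: 5 million operations per second.
single = ops_per_second(100)
# n = 10 operations per round trip: 5n = 50 million operations per second.
pipelined = ops_per_second(100, ops_per_round_trip=10)
```

Under these assumptions the rate scales linearly with the number of operations performed between memory interactions, which is the property systolic arrays exploit.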
All computations can be classified either under the compute bound category
or under the I/O bound category of computations. Compute bound computations
involve more computations than the required I/O operations (such as matrix-matrix
multiplication). In I/O bound tasks, the number of computations is less than the
I/O requirements (such as matrix addition).
Figure 2.3: Systolic Array Representation
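As a rough illustration of this classification, the sketch below compares operation counts with I/O counts for the two examples named in the text. The counting conventions are assumptions made here for illustration (one multiply-add counted as one operation, every input element read once and every output element written once); they are not taken from the thesis.

```python
from fractions import Fraction


def compute_to_io_ratio(task, n):
    """Ratio of arithmetic operations to I/O words for n x n matrices.

    Assumed counts: both tasks read two n x n matrices and write one,
    i.e. 3 * n**2 words of I/O.
    """
    io_words = 3 * n * n
    if task == "matmul":
        operations = n ** 3   # n multiply-adds for each of the n*n results
    elif task == "add":
        operations = n * n    # one addition per result element
    else:
        raise ValueError("unknown task: " + task)
    return Fraction(operations, io_words)


# For n = 30, multiplication performs 10 operations per I/O word (compute bound),
# while addition performs 1/3 of an operation per I/O word (I/O bound).
matmul_ratio = compute_to_io_ratio("matmul", 30)
add_ratio = compute_to_io_ratio("add", 30)
```

With these counts the multiplication ratio grows with n while the addition ratio stays below one for every n, which matches the two categories described above.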
Systolic arrays reduce the number of memory interactions. In a systolic array,
once data are taken out from the memory, they are pumped through a number of
processing elements before the final result goes back to the memory. The flow of
data in a systolic array resembles the blood flow in the body and the term systolic
shows the analogy with the cardio-vascular biological system. The term array is used
to show the resemblance of the systolic array to a grid, as shown in Figure 2.3, in
which each junction point represents a processing element and the lines represent
the links between the processing elements.
Systolic arrays consist of a set of interconnected processing elements, each
element capable of performing some basic operations. The data flow in a pipelined
manner within a systolic array and communication with the outside world occurs
only at the boundary cells [2]. The memory requirement is reduced because the
intermediate results are not passed on to the memory. Other than the reduced
memory requirements, we get the following advantages:
• modular expandability
• regular and simple data flow
• use of simple and uniform cells.
Figure 2.4: Design 1; Broadcast inputs, Results move and Weights stay
Systolic arrays can be of many types (the types are defined based on the movement
of data through the array). Some of the basic types are discussed in the following
subsections. Consider a simple computation, given below:
$$y_i = w_1 x_i + w_2 x_{i+1} + \cdots + w_k x_{i+k-1} \qquad (2.1)$$
where w's are the pre-specified weights and x's are the input data sequence. Here
we would take k=3 for simplicity. So,
$$y_i = w_1 x_i + w_2 x_{i+1} + w_3 x_{i+2} \qquad (2.2)$$
Many types of systolic arrays can be designed to accomplish this task [2].
2.1.1 Broadcast inputs, move results and weights stay 
The systolic array with this design and basic cell operation are shown in Figure 2.4.
In this design, one of the basic criteria of systolic designs is not satisfied, still it
works on the same principle that the intermediate results are stored in the array
itself. The inputs are broadcast to all the cells at the same time, which is not
acceptable for systolic arrays. Due to this shortcoming, this design is classified as
semi-systolic design.
Figure 2.5: Design 2; Results stay, Inputs and Weights move
The data move at the tick of the clock pulse. The data present at the check
points A, B and output (shown in Figure 2.4), with reference to the clock are listed
below:
CLK   A       B               Output
0     w1x1    w2x1            w3x1
1     w1x2    w1x1 + w2x2     w2x1 + w3x2
2     w1x3    w1x2 + w2x3     w1x1 + w2x2 + w3x3
3     w1x4    w1x3 + w2x4     w1x2 + w2x3 + w3x4
4     w1x5    w1x4 + w2x5     w1x3 + w2x4 + w3x5
5     w1x6    w1x5 + w2x6     w1x4 + w2x5 + w3x6
6     w1x7    w1x6 + w2x7     w1x5 + w2x6 + w3x7
We notice that from clock 2 onwards we get one correct output per clock cycle. 
There are many variations of semi-systolic designs but for the sake of brevity we 
will not discuss them here. 
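The clock-by-clock behaviour of this broadcast design can be reproduced with a small simulation. This is only a sketch under assumed conventions: the function name is hypothetical, the cells are taken to hold w1, w2, w3 from left to right, each cell is assumed to latch y_out = y_in + w * x on every clock, and the first cell receives y_in = 0. The three latched values correspond to the check points A, B and Output.

```python
def design1_trace(weights, xs):
    """Simulate the semi-systolic design: inputs broadcast, weights stay, results move.

    Returns one tuple per clock holding the latched output of every cell
    (check points A, B and Output when there are three cells).
    """
    latched = [0] * len(weights)       # output latch of each cell
    trace = []
    for x in xs:                       # x is broadcast to all cells on this clock
        y_in = [0] + latched[:-1]      # each cell reads its left neighbour's latch
        latched = [y + w * x for y, w in zip(y_in, weights)]
        trace.append(tuple(latched))
    return trace


# Example with w = (1, 2, 3) and x = 1, 2, 3, 4, 5.
trace = design1_trace([1, 2, 3], [1, 2, 3, 4, 5])
# From clock 2 onwards the last element of each tuple is a complete result y_i.
results = [row[-1] for row in trace[2:]]
```

With k cells the first k - 1 clocks produce only partial sums at the output, which agrees with the observation that correct outputs appear from clock 2 onwards when k = 3.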
2.1.2 Results stay, inputs and weights move in opposite 
directions 
This design is shown in Figure 2.5. 
It is difficult to implement the previously explained semi-systolic design if the 
number of cells is large, because of the global broadcast bus requirement. 
In this design, consecutive x's and w's are separated by two clock cycles to get
the proper results. When $x_i$ and $w_i$ meet at a cell, the cell multiplies them and
adds the result to the previously stored result. When $w_1$ reaches a cell, it outputs
the stored values in the cell to the latch (shown below the cell in Figure 2.5) and
resets the cell before getting multiplied by $x_i$. Here the path (shown by broken
lines) is used for collecting the final outputs.
Figure 2.6: Design 3; Weights stay, Results and Inputs move (cell operation: $y_{out} = y_{in} + w_j x_{in}$)
Usually, the results and inputs move and the weights stay in the array. This
type of array is discussed next.
2.1.3 Weights stay, results and inputs move in opposite
directions
This array is shown in Figure 2.6 and here the results and inputs move systolically
in opposite directions. This type of design is most suited for on-line arrays and it
is used when the same set of coefficients is used to operate on different input data
(for example: recursive filtering, polynomial division etc.).
The other types are not discussed here for the sake of brevity.
Systolic arrays can be used for a number of processing operations. These ar-
rays ensure multiple computations per memory interaction. They are particularly 
suited for FIR, IIR filtering, convolution operations and various matrix operations, 
like matrix transpose, matrix vector multiplication, matrix matrix multiplication, 
matrix inversion etc. [2] to [5]. These arrays can be used for any compute bound
problem, which is regular (that is, one where repetitive computations are performed
on a large set of data).
Figure 2.7: Matrix Encoding Method
2.2 Fault Detection Schemes 
Systolic arrays are almost always designed to perform special purpose computations, 
so algorithm based fault detection schemes can be applied to them with very little 
hardware and time overheads. Some fault detection techniques are discussed in the 
following subsections. 
2.2.1 Matrix Encoding Methods 
In this scheme, the matrix is encoded by adding some checksums. Consider the
matrix-matrix multiplication shown in Figure 2.7. During encoding, a checksum
row is added to matrix A and a checksum column is added to matrix B. After the
multiplication, the result matrix, C, would have a checksum row and a checksum
column. An example of this is given below:
$$\begin{bmatrix} 3 & 2 & 4 \\ 2 & 4 & 1 \\ \{5\} & \{6\} & \{5\} \end{bmatrix} \times \begin{bmatrix} 1 & 2 & \{3\} \\ 2 & 4 & \{6\} \\ 3 & 1 & \{4\} \end{bmatrix} = \begin{bmatrix} 19 & 18 & \{37\} \\ 13 & 21 & \{34\} \\ \{32\} & \{39\} & \{71\} \end{bmatrix}$$
Here, the checksums are shown in curly brackets.
This method can be used for those matrix multiplication arrays where the
results stay. An error is detected by checking the checksums and it is located at
the intersection of the inconsistent row and inconsistent column. For an n x n
multiplication, an (n + 1) x (n + 1) array is required (overhead of (2n + 1) cells) [6].
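The encoding and the error location step can be sketched in a few lines. The helper names below are hypothetical and the code only illustrates the checksum idea on plain Python lists; it is not the array implementation described in [6].

```python
def with_checksum_row(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_checksum_col(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inconsistent_lines(c):
    """Return (rows, cols) of the data part of c whose checksums do not match."""
    n_rows, n_cols = len(c) - 1, len(c[0]) - 1
    rows = [i for i in range(n_rows) if sum(c[i][:n_cols]) != c[i][n_cols]]
    cols = [j for j in range(n_cols)
            if sum(c[i][j] for i in range(n_rows)) != c[n_rows][j]]
    return rows, cols


a = [[3, 2, 4], [2, 4, 1]]
b = [[1, 2], [2, 4], [3, 1]]
c = matmul(with_checksum_row(a), with_checksum_col(b))

# Inject a single error into one data element and locate it.
c[1][0] += 1
bad_rows, bad_cols = inconsistent_lines(c)   # the fault sits at their intersection
```

A single corrupted data element makes exactly one row checksum and one column checksum inconsistent, so the pair of indices identifies the faulty position. Errors in the checksum elements themselves are not handled by this sketch.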
Figure 2.8: Recomputation with Shifted Operands
2.2.2 Recomputing with Shifted Operands (RESO) 
Though many codes are available for concurrent error detection in addition and
subtraction arrays, they cannot be used for multiplier and divide arrays because
they unduly increase the complexity of the circuit.
For such arrays, RESO is an efficient scheme. The basic concept of this scheme
is shown in Figure 2.8. First the function f (which is the required operation on the
operands) is performed on the data x and the result is stored. The data x is then
encoded by c and f is performed on this encoded data. The final result is decoded
by $c^{-1}$ and the decoded result is compared with the stored result. Any mismatch in
these two values shows the error [7], [8]. Here, the coding c is performed by shifting
the operands.
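The compute, shift, recompute and compare sequence can be illustrated with a toy adder. The fault model used here (one sum output bit of the adder stuck at 0) and the function names are assumptions chosen for illustration; the schemes in [7] and [8] cover more general bit-slice faults.

```python
def faulty_add(a, b, stuck_bit=None):
    """Add two non-negative integers; optionally force one sum bit to 0.

    stuck_bit models a stuck-at-0 fault on that output bit of the adder
    (an assumed fault model, used only for this example).
    """
    total = a + b
    if stuck_bit is not None:
        total &= ~(1 << stuck_bit)
    return total


def reso_add(a, b, stuck_bit=None, shift=1):
    """Return (first_result, error_detected) using recomputation with shifted operands."""
    first = faulty_add(a, b, stuck_bit)                   # f(x), stored
    recomputed = faulty_add(a << shift, b << shift,       # f(c(x)) on shifted operands
                            stuck_bit) >> shift           # decoded by c^-1
    return first, first != recomputed


fault_free = reso_add(5, 6)               # both passes agree, no error flagged
with_fault = reso_add(5, 6, stuck_bit=1)  # the two passes disagree, error flagged
```

Because the same physical bit position corrupts a different logical bit in the shifted pass, the two results disagree whenever the fault actually changes the sum. The sketch needs one spare high-order bit for the shifted operands, which Python integers provide implicitly.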
If many operands are used as inputs, then it may not be possible to shift all
the input operands equally. In this case the operands can be assumed to be shifted
by $k_1, k_2, k_3, \ldots$ and the result obtained by using these shifted operands would be
shifted by r bits. This scheme is known as RESO ($k_1, k_2, \ldots, r$) [9].
If $E_0$ and $E_1$ are the sets of all possible erroneous outputs of the first
computation, $f_F(x)$, and of the repeated (shifted) computation, $f'_F(x)$, respectively
due to a fault F, where $f(x)$ is the required function, then the errors are detectable
iff $E_0 \cap E_1 = \emptyset$ (which means that any possible erroneous output of the
repeated step, $f'_F(x)$, must not be an element of $E_0$).
The potential error set (as explained in [8]) of the first unshifted computation
can be written as:
$$E_0 = \{\pm 2^i \times q \mid q = 1, 2, \ldots, u\}, \qquad (2.3)$$
where i is the minimum bit-slice index of the faulty module and u is the
maximum error factor (which reflects the integer value of the affected output bits
due to the fault). To make it more clear, we can consider the circuit shown in
Figure 2.9.
Figure 2.9: Multiplier with Ripple Carry Adder
Here, when one adder cell i fails, it tries to change the value of the output. The
ith adder cell failure can result in an error in the ith sum bit or the carry bit (which
affects the (i + 1)th bit). So, there are two possible bits which can be affected and
these bits have weights $2^i$ and $2^{i+1}$. This gives four possible combinations:
• both bits correct; error = 0,
• bit $2^i$ has error, $2^{i+1}$ correct; error = $\pm 2^i$,
• bit $2^{i+1}$ has error, $2^i$ correct; error = $\pm 2^{i+1}$,
• both bits have errors; error = $\pm 3 \times 2^i$.
So, the result is in error by any one of the error set $\{0, \pm 2^i, \pm 2^{i+1}, \pm 3 \times 2^i\}$.
So, for an adder, u=3. In the earlier discussion, we neglected the element 0 of
the set, because this identifies a correct output.
In the recomputation step, the result is shifted left by r bits with respect to the
original unshifted result. So the potential error set of the recomputation is:
$$E_1 = \{\pm 2^{i-r} \times q \mid q = 1, 2, \ldots, u\}. \qquad (2.4)$$
Now, the disjointness of $E_0$ and $E_1$ can be ensured by making sure that the
maximum element in $E_1$ is less than the minimum element in $E_0$ (in magnitude).
Using this strategy, arrays can be designed, whose faults can be diagnosed by
RESO method [9].
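The disjointness condition can be checked numerically for small cases. In the sketch below the two error sets are built directly from their definitions; the function names are hypothetical. Comparing magnitudes, the largest element of the shifted set is u times 2^(i-r) and the smallest element of the unshifted set is 2^i, so the condition reduces to 2^r > u. That reduction is a reading of the stated condition made here, not a formula quoted from the thesis.

```python
from fractions import Fraction


def error_set(i, u, shift=0):
    """All values +/- 2**(i - shift) * q for q = 1..u, as exact fractions."""
    base = Fraction(2) ** (i - shift)
    return {sign * base * q for q in range(1, u + 1) for sign in (1, -1)}


def sets_disjoint(i, u, r):
    """True when the unshifted and r-bit-shifted error sets share no element."""
    return error_set(i, u).isdisjoint(error_set(i, u, shift=r))


# Adder example from the text: u = 3.
one_bit_shift = sets_disjoint(4, 3, 1)   # 2**1 = 2 is not greater than 3
two_bit_shift = sets_disjoint(4, 3, 2)   # 2**2 = 4 is greater than 3
```

For the adder with u = 3, a one-bit shift leaves a common element (2^i appears in both sets), whereas a two-bit shift separates the sets completely.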
2.2.3 Triple Data Redundancy 
This scheme uses the basic modular property of systolic array (that all the
processing elements, PEs, perform the same operation) to detect (and mask a few) errors.
It is suitable for one dimensional arrays.
In this scheme, three PEs perform the same computation on the same data at
a time and they pass on their results to the next 3 PEs, which compare these 3
results and then perform the computation on the majority-voted input. Since this
scheme uses three PEs to perform the same operation on the same data, it can
mask the presence of a single fault and detect double faults [10].
The input is given to $PE_1$, $PE_2$ and $PE_3$ simultaneously (Figure 2.10) and
they perform their portion of work on this data and then
• $PE_1$ passes on the result to $PE_2$, $PE_3$ and $PE_4$,
• $PE_2$ passes on the result to $PE_3$ and $PE_4$ and
• $PE_3$ passes on the result to $PE_2$ and $PE_4$.
So, at the next clock pulse, $PE_2$, $PE_3$ and $PE_4$ get three identical inputs (if no
fault is present) and each one of them votes on the data and then performs the
computation on the voted data.
Figure 2.10: Triple Time Redundancy
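The voting step performed by each PE can be sketched as follows. The function name and the return convention are assumptions for illustration; the thesis does not specify how the voter is implemented.

```python
from collections import Counter


def vote(inputs):
    """Majority-vote three copies of the same datum.

    Returns (value, dissenting), where value is the majority value (or None
    when all three copies differ) and dissenting lists the positions of the
    inputs that disagree with the majority.
    """
    if len(inputs) != 3:
        raise ValueError("exactly three inputs are expected")
    value, count = Counter(inputs).most_common(1)[0]
    if count == 1:
        # No two copies agree: the fault is detected but cannot be masked.
        return None, [0, 1, 2]
    dissenting = [k for k, item in enumerate(inputs) if item != value]
    return value, dissenting


no_fault = vote([7, 7, 7])       # all copies agree
single_fault = vote([7, 9, 7])   # the wrong copy is masked and its position reported
double_fault = vote([7, 9, 4])   # no majority: detected only
```

A single wrong copy is outvoted and its position can be reported to the central processor, while three different copies can only be flagged. Two faulty copies that happen to agree on the same wrong value would win the vote; that case is outside this sketch.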
In the case of a detected error, the PE which detects the error informs the
central processor that the data output from $PE_x$ is wrong. After receiving this
message, the central processor attaches a flag (indicating a fault in the PE) to
$PE_x$ and ignores any further information about $PE_x$'s health. In addition, the
central processor maintains a table of the health of all PEs. Whenever it receives
an error message, it checks the table and if the faulty PE falls within a distance
of 2 from another faulty PE, reconfiguration is done by removing 3 PEs from the
array. If in the array, shown in Figure 2.10, all PEs are working properly initially
and then $PE_{n-2}$ fails, the central processor marks it in the table and next if
either $PE_{n-1}$ or $PE_n$ fails, the reconfiguration removes $PE_{n-2}$, $PE_{n-1}$
and $PE_n$ from the array. Similarly, if in this case (with $PE_{n-2}$ as first failure)
either $PE_{n-3}$ or $PE_{n-4}$ fails, the reconfiguration removes $PE_{n-4}$, $PE_{n-3}$
and $PE_{n-2}$ from the array. So a reconfiguration removes exactly 2 faulty cells and
1 non-faulty cell from the array. The reconfiguration reduces the array size and this
necessitates a restructuring of the algorithm executing on the array. So, after every
reconfiguration, the full array is flushed and the restructuring algorithm is run.
This is done by the central processor [10].
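The bookkeeping done by the central processor can be sketched as follows. This is only an illustration under stated assumptions: the class name `HealthTable` and its method are hypothetical, the choice of the removed block (anchored at the first faulty PE and extended towards the second one) is inferred from the two examples above, the block is clamped so that it stays inside the array, and the renumbering of the PEs after a removal is not modelled.

```python
class HealthTable:
    """Central-processor view of a 1-D array of PEs numbered 1..n (hypothetical helper)."""

    def __init__(self, n):
        self.n = n
        self.flagged = set()  # PEs marked faulty but not yet removed

    def report(self, x):
        """Record a fault at PE x; return the 3 PEs to remove, or None if no reconfiguration."""
        if x in self.flagged:
            return None  # further reports about a flagged PE are ignored
        near = sorted(y for y in self.flagged if abs(y - x) <= 2)
        if not near:
            self.flagged.add(x)  # first fault in this neighbourhood: only mark it
            return None
        first = near[0]
        # Block starts at the first faulty PE and extends towards the new one.
        low = first if x > first else first - 2
        low = max(1, min(low, self.n - 2))  # assumption: keep the block inside 1..n
        self.flagged.discard(first)
        return (low, low + 1, low + 2)

table = HealthTable(n=10)
print(table.report(8))  # None: PE_{n-2} is only flagged
print(table.report(9))  # (8, 9, 10): PE_{n-2}, PE_{n-1} and PE_n are removed
```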
Figure 2.11: Comparison with Concurrent Redundant Computation (CCRC) 
2.2.4 Comparison with Concurrent Redundant Computation (CCRC)
This scheme can be used for those systolic arrays in which the results move and
the weights stay in the PEs [11]. Here, the same computation is done by $PE_i$ and
$PE_{i-1}$ at the same time and the results are compared (Figure 2.11). This algorithm
assumes that only one PE fails at a time, so if $PE_i$ is faulty, it will be detected by
comparing $y_i$ and $y_{i-1}$.
To implement this, the same input is given to the array twice. This can be
done in many ways. One of the methods is shown in Figure 2.11. Here, $PE_E$ is
the extra PE, which is used to introduce the proper delay and calculate the first
partial result.
The comparison can be done in two different ways. The scheme shown in
Figure 2.12.a assumes that even when $PE_i$ is faulty, its comparator is working.
This condition is difficult to achieve. The scheme shown in Figure 2.12.b does not
assume this, but it requires an additional link between the PEs.
This scheme generates an asynchronous error signal, which is necessary. In
this case the fault is detected even before the error propagates to the output and
corrupts the next stage of the system [11].
2.2.5 Double Calculation in the Same PE 
This scheme is suitable for the systolic arrays where the results stay in PEs and
the coefficient and data streams move [12]. In such systolic arrays, the partial
results stay in the PEs and when the final result becomes available, it is passed on to
the output register from where it is scanned out.
Figure 2.12: CCRC: Comparison Schemes
Now, consider an FIR filter:
$$y_n = \sum_{i=0}^{N} a_i x_{n-i}, \qquad (2.5)$$
where $N$ is the number of PEs.
To implement this filter, we have to separate the adjacent coefficient terms
and data terms by two cells. This cell separation feature can be used to get fault
tolerance. To add this additional feature, some extra hardware is required. Without
fault tolerance, the normal PE looks as in Figure 2.13.a. A second accumulator is
added to store the results of a second calculation, Figure 2.13.b. Each accumulator,
$R_A$, feeds the adder and accepts its output during alternate clock cycles. So two
independent calculations can be performed in each PE. When the calculation of an
output term is completed, the adder output is sent directly to the output register
$R_O$, while the accumulator containing the parallel result is reset.
Data flow is shown in Figure 2.14. Two boxes are shown for each PE and the
content of each box represents the content of an accumulator in the PE. In the
figure, $ij$ means $x_i \cdot a_j$; for example, 32 would mean $x_3 \cdot a_2$.
Figure 2.13: Recomputation in the same PE (a: normal PE; b: PE with a second accumulator)
$R_D$: Data register, $R_A$: Accumulator, $R_W$: Weight register, $R_O$: Output register
It is clear from this flow diagram that every output is available from two different
PEs. These outputs can be compared to detect a fault.
Here, it is not possible to locate the faulty PE, because only 2 copies of the
result are available, but whenever a fault is detected, the faulty PE can be located
by running some exhaustive checking algorithm.
2.3 Reconfiguration Schemes 
Fault tolerance is incorporated in a systolic array to achieve two basic goals: 
• to improve the system reliability and 
• to improve the yield of VLSI and WSI chip production. 
To improve the chip density it is required that the physical dimensions of the
transistor level circuitry be reduced, making the manufacturing process more
error-prone.
Figure 2.14: Data Flow Diagram (for recomputation in the same PE)
For a typical bulk CMOS process, the following is a brief list of common defects: 
• Photolithography Defect: it causes missing or extra patterns on a mask layer. 
Common sources of this are mask defects, dirt particles and uneven etching. 
• Contact and Via Defects: these are the windows between different layers for
providing interlayer connections. The defects in these can result in a smaller/larger
window area causing the shorting of other connections.
• Gate Oxide Defect: charge trapping in gate-oxide regions of MOS devices 
results in threshold voltage shifts which can lead to reduced noise margins 
and malfunctioning of gates. 
Because of these reasons, the production of VLSI/WSI chips does not always give
a yield at an acceptable level. To improve the yield, the chip is designed to be
fault-tolerant [13].
To achieve fault tolerance we have to provide redundancy, which can be of two
types:
• hardware redundancy: in this case, spare cells and the corresponding inter-
connection network are provided and in the case of a fault, reconfiguration is 
done. 
• time redundancy: here, the processing elements are provided a number of 
processing states. Working elements perform the functions of faulty cells if 
any fault occurs. In this case, the number of elements does not increase 
but the interconnection network becomes very complex. Also, the processing 
speed decreases drastically, so it is not suitable for systolic arrays. 
Usually, hardware redundancy is provided in an array and in case of a fault,
reconfiguration is done. The goal of the reconfiguration is to achieve 100% spare
utilization (i.e. if N spare cells are available, the array should survive up to N
faults).
In discrete system architecture, 100% spare utilization is possible and also
desirable because here the cost of the processing element is much higher than that of
the interconnection network and usually in this case each processing element is a CPU,
so the re-routing can be performed by one of the working PEs. If the CPU is an
extremely simple device (which cannot perform the re-routing), the reconfiguration
is not needed because in this case the reliability of the system will be extremely
high due to the simple CPU design.
In the case of a systolic array, though the utilization of spare cells is still
important, it is also necessary to maintain the locality of interconnections. Here, it
is essential to use simple routing devices to minimize the time delays and silicon
area (it has been proved that an excessive increase of chip area due to fault-tolerance
related circuits has a negative effect on overall device reliability).
So, for a systolic array, the reconfiguration process has to provide a compromise
between the reconfiguration effectiveness and algorithm complexity. This compromise
depends on the approach adopted for the reconfiguration, namely:
• static reconfiguration, performed at production time, 
• dynamic reconfiguration driven by a host computer at run time and 
• dynamic reconfiguration, performed on-chip at run time. 
Static reconfiguration is uniquely determined at production time and for this
the testing is performed externally (so no on-chip control circuitry is required). The
complexity of the reconfiguration algorithm is not a critical issue because it does
not affect either circuit complexity or operation speed.
For the second case, it is assumed that the external host can perform
reconfiguration-controlling actions on the basis of the available error information.
Figure 2.15: Direct Replacement and Global Deformation 
The third case introduces additional costs for self testing and self reconfigura-
tion. 
For all the dynamic reconfiguration algorithms, the problem of error latency
(defined as the time that passes before the array is operational again after the
occurrence of a fault) has to be considered. Any reconfiguration approach involves
two problems: the first problem is that of routing data through the reconfigured
array. It involves the introduction of redundant links and routing devices. The
locality of the interconnection network is maintained by using the global deformation
technique in place of the direct replacement technique. In the global deformation
technique, if cell $i$ is faulty (see Figure 2.15), cell $(i+1)$ assumes the role of
cell $i$, cell $(i+2)$ performs the functions of cell $(i+1)$ and so on. The spare cell
performs the function of cell $N$. In the direct replacement technique, the spare cell
has to perform the function of cell $i$ and this disturbs the uniform data flow
assumption of the systolic array.
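The difference in locality between the two techniques can be illustrated with a small sketch. It assumes a linear array of cells 1..N with a single spare at physical position N+1; the function names are illustrative and the sketch only compares link lengths, not routing hardware.

```python
def global_deformation(n, faulty):
    """Logical cell k -> physical cell: every cell from the faulty one onwards shifts by one."""
    return {k: (k if k < faulty else k + 1) for k in range(1, n + 1)}

def direct_replacement(n, faulty):
    """Only the role of the faulty cell moves, straight to the spare at n + 1."""
    return {k: (n + 1 if k == faulty else k) for k in range(1, n + 1)}

def longest_link(mapping):
    """Largest physical distance between logically adjacent cells."""
    cells = sorted(mapping)
    return max(abs(mapping[b] - mapping[a]) for a, b in zip(cells, cells[1:]))

n, faulty = 8, 3
print(longest_link(global_deformation(n, faulty)))  # 2
print(longest_link(direct_replacement(n, faulty)))  # 7
```

With global deformation the longest link only has to bypass the faulty cell, whereas with direct replacement it spans the distance between the faulty cell's neighbour and the spare.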
The second problem is that of the reconfiguration computation as related to fault
distribution. It involves the implementation of the reconfiguration algorithm [14].
An M x N faulty array is said to be reconfigurable into an m x n array iff m 
horizontal and n vertical data flow paths can be achieved by reconfiguration. 
There are two major types of reconfiguration schemes:
• Set Switching Schemes: here, a faulty cell is replaced by logically removing a
set of cells (row, column, block etc.) that contains the faulty cell. It is easily
implemented but the waste of non-faulty cells is large.
• Processor Switching Schemes: here the replacement scheme proceeds in a
chain fashion such that a faulty cell is replaced by (shifted to) an immediate
neighbour and so on until the spare cell is reached [15].
Figure 2.16: Row Column Cut Method (a: array with R-cut and C-cuts; b: logic circuit; L: latches)
Various reconfiguration schemes are discussed in the following sub-sections. 
2.3.1 RC Cut (Row Column Cut) Method 
A cut is defined as a set of cells, such that bypassing them leads to an array with
one less data flow path. A horizontal (vertical) cut removes one horizontal (vertical)
data flow path from the original array. A horizontal (vertical) cut is also called a row
(column) cut.
In this method, for a faulty cell all the cells in the same row/column are taken
to be in a cut. So, in the array shown in Figure 2.16.a, one horizontal and two
vertical paths are involved in cuts. This results in a reconfigured 3 x 2 array from
a 4 x 4 array.
The routing arrangement is shown in Figure 2.16.b. It is clear from the figure
that any cell can be bypassed by simple switch controls. The architecture and path
generation are simple but this algorithm wastes a large number of non-faulty cells.
Particularly, for a large array (suppose a 10 x 10 array), the failure of just one cell
removes a large number of cells (in this case 10) from the array [16].
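The waste of non-faulty cells can be counted with a short sketch. The text does not state how a row cut or a column cut is chosen for a given fault, so the sketch takes the cuts as input and only checks that they cover all faults; the function name `rc_cut` and the 0-based cell indices are assumptions made for illustration.

```python
def rc_cut(rows, cols, faults, cut_rows, cut_cols):
    """Apply row/column cuts; return (surviving array size, number of good cells discarded)."""
    uncovered = [(r, c) for r, c in faults if r not in cut_rows and c not in cut_cols]
    if uncovered:
        raise ValueError(f"faulty cells not covered by any cut: {uncovered}")
    removed = {(r, c) for r in range(rows) for c in range(cols)
               if r in cut_rows or c in cut_cols}
    wasted = len(removed - set(faults))
    return (rows - len(cut_rows), cols - len(cut_cols)), wasted

# One faulty cell in a 10 x 10 array, removed by a row cut.
size, wasted = rc_cut(10, 10, faults=[(4, 7)], cut_rows={4}, cut_cols=set())
print(size, wasted)  # (9, 10) 9
```

In this example the single fault removes ten cells, nine of which are fault-free, which matches the observation made above for the 10 x 10 array.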
2.3.2 RCS (Row, Column Slanted) Cut Method 
This is also known as the Kung and Lam approach. Here, the cells contributing to a cut
may not be from the same row or column but they satisfy the following conditions:
• a cut must contain one cell per row (vertical cut) or one cell per column
(horizontal cut) and the slope of the line containing the cells in the cut must
be non-negative and
• the inclination of the line connecting the cells in the cut between the successive
columns must be 0 or 45 degrees for horizontal cuts and 90 or 45 degrees
between successive rows for vertical cuts.
One such vertical cut is shown in Figure 2.17.a. Here, the 4 x 4 array (used as
an example in the RC-cut subsection) is reconfigured into a 4 x 3 array. The routing
arrangement for an RCS cut is shown in Figure 2.17.b. It is clear that the utilization
of cells is improved in this method, but the routing complexity is also increased. It
is difficult to get an optimum cut in this method and for fewer faults this scheme
also wastes a large number of non-faulty cells [16].
2.3.3 Kuo-Fuchs Method 
Now, consider the (7+2) x (9+3) array shown in Figure 2.18.a. Here (7+2) x (9+3)
means that it is a 7 x 9 array, having seven rows, R1 through R7, and nine columns,
C1 through C9, with two spare rows, SR1 and SR2, and three spare columns, SC1, SC2
and SC3. Only faulty PEs are shown in the figure.
Figure 2.17: Row Column Slanted Cut Method (a: array with a slanted column cut; b: logic circuit)
Figure 2.18: Kuo-Fuchs Method (a: array showing faulty PEs; b: bipartite graph)
A general set replacement algorithm replaces faulty rows/columns by proceeding
from left to right and top to bottom; so, rows 1 and 3 would be replaced by the spare
rows and columns 3, 4 and 7 would be replaced by the spare columns. Obviously,
it does not reconfigure the array.
In the Kuo-Fuchs method, the rows/columns that contain the maximum number
of faulty cells are replaced first. To implement this, the array is modelled as a
bipartite graph, whose two sets of nodes are array rows and columns that contain
faulty cells. Edges of this graph refer to the faulty cells. The bipartite graph of the
(7+2) x (9+3) array is shown in Figure 2.18.b.
This method first chooses the nodes with the maximum number of branches and
replaces them. Here, first R1 and R4 are replaced with spare rows and then C1, C4 and
Co are replaced with spare columns. This achieves a successful reconfiguration [17].
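The "maximum number of branches first" rule can be sketched as a greedy heuristic. This is only an illustration of the rule as described above, not necessarily the complete method of [17]; the function name, the tie-breaking order and the fault pattern in the example are hypothetical (the fault pattern of Figure 2.18 is not reproduced here).

```python
from collections import Counter

def greedy_replace(faults, spare_rows, spare_cols):
    """Repeatedly replace the row or column that covers the most remaining faults.

    Returns (rows_replaced, cols_replaced), or None when the spares run out.
    """
    remaining = set(faults)
    rows, cols = [], []
    while remaining:
        candidates = []
        if len(rows) < spare_rows:
            degrees = Counter(r for r, _ in remaining)  # branches per row node
            candidates += [(n, "row", r) for r, n in degrees.items()]
        if len(cols) < spare_cols:
            degrees = Counter(c for _, c in remaining)  # branches per column node
            candidates += [(n, "col", c) for c, n in degrees.items()]
        if not candidates:
            return None  # faults remain but no spare row/column is left
        _, kind, index = max(candidates)  # ties broken by kind, then by index
        if kind == "row":
            rows.append(index)
            remaining = {(r, c) for r, c in remaining if r != index}
        else:
            cols.append(index)
            remaining = {(r, c) for r, c in remaining if c != index}
    return rows, cols

# Hypothetical pattern: replacing row 1 first would leave three faulty columns.
faults = [(1, 1), (2, 2), (2, 3), (2, 4)]
print(greedy_replace(faults, spare_rows=1, spare_cols=1))  # ([2], [1])
```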
In all the above-mentioned schemes the utilization of non-faulty cells is very
poor. Next, some processor switching schemes are discussed, where an available
spare cell directly or indirectly replaces a faulty cell. Because of this, for these
methods, the reconfiguration efficiency is good.
2.3.4 Diogenes Method 
In this approach the array is laid out in a line with bunches of wires, called bundles, 
running above the line (the PEs need not literally lie in a line), as shown in 
Figure 2.19. 
Each PE has some number of lines entering it (connecting it to the PEs that
lie to its left in the line) and some number of lines leaving it (connecting it to the
PEs that lie to its right in the line). These entering and leaving sets of lines are
connected to the bundles through switches that are set by external control. The
PEs are scanned in a row and the faulty PEs are not connected to the bundle.
So, the utilization of the spares is maximized.
Figure 2.19: Diogenes Method
In this method, the PEs are tested first and the outcomes of the tests are
available to the buses via control lines $GOOD_i$ that indicate the presence or absence
of a fault in the $i$th PE. If $PE_i$ is fault free, the corresponding control line would
be high and $PE_i$ would be hooked to the bundle. A PE is hooked to the bundle
only if the corresponding line, $GOOD_i$, is high. This feature facilitates the testing
also. Any PE can be isolated and tested by setting its $GOOD_i$ line to '1' and the other
$GOOD_i$ lines to '0'.
This scheme requires a large silicon area for the switch bus that might itself fail.
In the presence of consecutive faulty PEs, logically adjacent PEs can be far apart
physically, reducing the system speed [18].
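The effect of the $GOOD_i$ lines can be sketched in a few lines. The sketch assumes that the fault-free PEs are simply chained in their physical order; the function names and the 0-based indices are illustrative, and the switch and bundle hardware is not modelled.

```python
def hook_up(good):
    """good[i] is the GOOD_i line of physical PE i (True = fault free).

    Returns the physical indices of the PEs hooked to the bundle, in logical order.
    """
    return [i for i, ok in enumerate(good) if ok]

def longest_gap(chain):
    """Largest physical distance between logically adjacent PEs."""
    return max((b - a for a, b in zip(chain, chain[1:])), default=0)

good = [True, True, False, False, False, True, True]
chain = hook_up(good)
print(chain)               # [0, 1, 5, 6]
print(longest_gap(chain))  # 4
```

The run of three consecutive faulty PEs places the logically adjacent PEs 1 and 5 four positions apart, which is the speed penalty mentioned above.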
2.3.5 Fault Stealing Methods 
These are also known as index-mapping schemes. Here, for an array of M x N
cells, the spares are organized along the (M+1)th row and the (N+1)th column.
Reconfiguration is performed by mapping the array functions onto the working
cells by means of a global renaming process. Whenever a given algorithm does not
complete this mapping onto correctly working cells, a fatal failure condition is said
to occur.
Figure 2.20: Simplest Fault Stealing Method
In this scheme, the physical and logical indices are defined first for each cell.
The physical indices (i, j) denote the position in the physical array consisting of all
cells and the logical indices (i', j') denote the position in the logical array, which
consists of working cells only and implements all the functions required by the array.
Consider the simplest case, in which a spare column is added. If a cell (i, j) is
faulty, it is bypassed and its logical indices (i', j') are associated with cell (i, j+1);
the cells (i, k), k > j, are shifted right in the same way. Figure 2.20 shows the result
of one such reconfiguration. The fatal failure condition is reached whenever there are
two faulty cells in a row. This problem can be overcome by adding one spare row and one
spare column to the array and slightly modifying the algorithm. The modified algorithm
is as follows:
• the array is scanned from top to bottom,
• if in row i there is only one faulty/stolen cell, rightward reconfiguration is
performed for that row,
• otherwise, the rightmost faulty or stolen cell invokes rightward reconfiguration,
while all other faulty or stolen ones steal the functions of cells in the
corresponding positions of row (i+1), making them stolen cells. Stealing by
(i, j) implies associating logical indices (i, j) with the stolen cell.
Figure 2.21 shows one such reconfiguration.
Figure 2.21: Modified Fault Stealing Method
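The simplest case (a single spare column, before the modification with stealing) can be sketched as an index mapping. This is a minimal illustration assuming 0-based indices, with the spare column at physical column index `cols`; the function name is hypothetical and the stealing rules of the modified algorithm are not implemented.

```python
def simple_fault_stealing(rows, cols, faults):
    """Map logical (i, j) -> physical (i, j') in a rows x (cols + 1) physical array.

    Raises ValueError on a fatal failure (two faulty cells in the same row).
    """
    mapping = {}
    for i in range(rows):
        bad = [j for j in range(cols + 1) if (i, j) in faults]
        if len(bad) > 1:
            raise ValueError(f"fatal failure: two faulty cells in row {i}")
        # Without a fault (or with a faulty spare) nothing in the row is shifted.
        shift_from = bad[0] if bad else cols
        for j in range(cols):
            mapping[(i, j)] = (i, j if j < shift_from else j + 1)
    return mapping

mapping = simple_fault_stealing(rows=2, cols=3, faults={(0, 1)})
print(mapping[(0, 1)], mapping[(0, 2)], mapping[(1, 2)])  # (0, 2) (0, 3) (1, 2)
```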
In this case, a fatal failure condition is reached when a stolen cell is faulty. The
locality is high in this case also. Here, a faulty cell (i, j) can be shifted to a
fault-free cell (i, j+1) or (i+1, j). The set consisting of cells (i, j), (i, j+1) and
(i+1, j) is referred to as an adjacency domain. This adjacency domain can be extended
and the algorithm can be modified to get more spare utilization. The modified approach
is called complex fault stealing [14].
2.3.6 CFS (Complex Fault Stealing) Method 
In this scheme, a spare row and a spare column are provided to the N x N array
and the algorithm is as follows:
• assume that in row $i$, $1 \le i \le N$, there are faulty or stolen cells
$(i, k_1), \ldots, (i, k_s)$
• for each $k_l$, $0 < l < s$:
a- if $(i+1, k_l)$ is fault free, $(i, k_l)$ is shifted to it,
b- else, if $(i+1, k_l+1)$ is fault free, $(i, k_l)$ is shifted to it,
c- otherwise, $(i, k_l)$ is shifted right.
• if no cell is shifted right along the row as per the previous rule, then $(i, k_s)$
is shifted right. Otherwise $(i, k_s)$ is shifted downwards to either $(i+1, k_s)$ or
$(i+1, k_s+1)$.
An example of this algorithm is shown in Figure 2.22. Here, (1, 2) is shifted
twice, first to (1, 3) and then to (2, 4). The interconnection links required by this
algorithm are very complex [15] [19].
Figure 2.22: Complex Fault Stealing Method
2.3.7 FUSS (Full Use of Suitable Spares) Method
This scheme uses an indicator vector, called the surplus vector, to guide the
replacement of faulty cells in an array. In its ideal case, FUSS achieves 100% spare
survivability. In FUSS-C, the array is an M x (N+C) array, where C is the number
of spare columns (spare rows are not used). First, the surplus vector of the array
is computed. Let $f_i$ be the number of faulty cells in row $i$. The surplus vector
(S-vector) is defined as
$$S = (s_1, s_2, \ldots, s_M),$$
where $s_i = \sum_{j=1}^{i} (C - f_j)$ is the surplus of row $i$.
Next,
• if $s_i > 0$, then the sum of spares in rows 1 through $i$ is greater than the number
of faulty cells in row 1 through row $i$; so row $i$ has extra cells available for use
by faulty cells in rows $i+1, i+2, \ldots, M$,
• if $s_i < 0$, then row $i$ has a deficit and needs to use available cells from rows
$i+1, i+2, \ldots, M$,
• if $s_M < 0$, then the total number of spares in the array is less than the number
of faulty cells. In this case the array is not reconfigurable and a fatal failure
occurs.
In FUSS-C, an unavailable cell $(i, j)$ can be shifted down to $(i+1, j)$ if $s_i$ is
negative or shifted up to $(i-1, j)$ if $s_{i-1}$ is positive.
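The surplus vector is a running sum and can be computed directly. The sketch below uses the fault pattern of the FUSS-2 example of Figure 2.23 ('1' marks a faulty cell, C = 2 spare columns); the function name is illustrative and only the S-vector is computed, not the shifting phase.

```python
from itertools import accumulate

def surplus_vector(array, spare_cols):
    """s_i = sum over rows 1..i of (C - f_j), where f_j counts the '1' (faulty) cells of row j."""
    return list(accumulate(spare_cols - row.count("1") for row in array))

array = ["011000", "011100", "000000", "010101"]  # rows of the 4 x (4 + 2) example
s = surplus_vector(array, spare_cols=2)
print(s)          # [0, -1, 1, 0]
print(s[-1] < 0)  # False: the total number of spares is sufficient
```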
Row   ARRAY    S-vector   B-MATRIX
1     011000      0       011000
2     011100     -1       011100
3     000000     +1       030200
4     010101      0       010101
Figure 2.23: FUSS Scheme
After each step the corresponding entry in the surplus vector is readjusted
towards zero. Its effect can be described as a cell migration from regions having most
faulty cells to regions having fewer faulty cells.
Consider a 4 x (4+2) array shown in Figure 2.23 (FUSS-2 Scheme), where
'0' represents a good cell and '1' represents a faulty cell. The reconfiguration is
executed as follows:
• scan the array downwards. When $s_i < 0$, shift a number equal to $|s_i|$ of
unavailable cells to row $i+1$ and, when successful, reset $s_i$ to 0. Here, $s_2 = -1$,
so one cell (2, 2) is shifted down from row 2 to row 3 and this is assigned a
status code of 3,
• scan the array upwards. When $s_i > 0$, shift $|s_i|$ unavailable cells in row $i+1$
to row $i$; $s_i$ is reset to 0 when all $s_i$ cells are shifted successfully. Here,
$s_3 = 1$, so one cell from row 4, cell (4, 4), is shifted up to cell (3, 4), which
assumes the status code of 2; $s_3$ is readjusted to 0.
Now, the surplus vector is 0, which means that the fault shifting is successful. The
status matrix (B-matrix) has the status codes that guide the cell interconnection
phase of FUSS. Entry $b_{ij}$ has the following meaning:
• $b_{ij} = 0$, if $(i, j)$ is fault-free,
• $b_{ij} = 1$, if $(i, j)$ is faulty,
• $b_{ij} = 2$, if $(i, j)$ replaces $(i+1, j)$ and
• $b_{ij} = 3$, if $(i, j)$ replaces $(i-1, j)$.
Now, since the status of the cells is known, it is easy to derive the interconnection
between the cells. In this algorithm, the probability of survival improves
and fewer cells are wasted. However, the algorithm becomes more complex and the
interconnection requirement is increased.
Figure 2.24: Interstitial Redundancy Scheme
2.3.8 Local Redundancy Methods 
In these schemes, the array is partitioned into smaller arrays, each of which can be
reconfigured independently. The main objective of these schemes is the minimization
of the interconnection delays. One such scheme is discussed next.
The scheme is called interstitial redundancy and it maintains short interconnection
links.
The array is divided into a number of subarrays (clusters) and one spare is
allocated to each cluster. The array shown in Figure 2.24 has 25% redundancy.
Each cluster is independent and it can tolerate one faulty cell. The spares are
physically close to the faulty cell they replace [20].
In these schemes, if reconfiguration is not possible within a block, the system
fails unless the faulty block can be replaced by a functional one. To avoid this, the
array can be organised in a hierarchical way. One such scheme is the CHiP (configurable
highly parallel) architecture, made up of building blocks, each of which is a two
dimensional CHiP array [21].
The cut methods are simple but they are not efficient. In the RCS cut
method, sometimes it is difficult to get an optimum cut. The switching scheme is
very simple for these methods.
The fault stealing and FUSS methods are very efficient but their algorithms and
switching structures are complex.
Some of the above schemes cannot be used effectively during run-time because
every time a fault occurs, the full algorithm has to run and it may completely
change the previous reconfiguration. These algorithms are suitable for improving
the production time yield.
In the next chapter an on-line reconfiguration scheme is proposed for PE failures.
Chapter 3 
ON LINE 
RECONFIGURATION 
On line reconfiguration is performed to increase the reliability of the system for the
full duration of a mission. Here, in the case of a fault detection, the array is not
flushed as required by the previous algorithms.
The reconfiguration scheme should be capable of:
• fault detection: if the fault is not detected, the array fails and this failure
cannot be detected by the central processor; this is an unsafe failure;
• fault location: the fault location is important in order to replace the faulty
PE by a non-faulty PE;
• re-routing: the scheme should be capable of mapping the new logical index
onto the physical index and
• fault blocking: to ensure that the faulty data are not passed on to the next
PE, otherwise all the further computations would use the faulty data and all
the results would be faulty.
A major concern for an on-line reconfiguration is complete use of non-faulty par-
tial results. During reconfiguration the fault-free partial results should be handled 
properly. 
The reconfiguration scheme should have the following attributes:
• simplicity of algorithm: the algorithm should be simple, so that it causes little
disturbance in the array. Here, disturbance refers to the total number of PEs
for which the logical index changes.
• minimal additional hardware: any additional capability requires some extra
hardware, which depends on the algorithm. The algorithm should use minimum
additional hardware, otherwise the additional hardware would bring
down the array reliability instead of improving it.
• use of fault-free partial results: in systolic arrays, partial results are passed on
to the next cell as input. In the case of a fault occurrence, the faulty partial
results should be blocked and the fault-free partial results should be
utilized to best advantage.
• locality: the locality of the data is one of the major attributes of systolic
arrays and the reconfiguration algorithm should maintain it. It is maintained
by using global deformation instead of direct replacement.
A scheme is proposed in the following section for on-line reconfiguration that has 
these attributes. 
3.1 On-Line Reconfiguration Scheme
This scheme does not perform any on-line testing, so self-testing PEs are required.
When a PE detects any fault, it invokes the reconfiguration. The following
assumptions are made.
Assumptions: 
• the faults are occurring one at a time; 
• the links and the switching network are fault-free; 
• once a fault occurs, it is detected by the PE;
• the control circuitry of PEs never fails;
• a central processor provides input and clock to the array and it receives output
and fault occurrence signals from the array and
• the occurrence of a failure is reported to the central processor before the
arrival of the next rising clock edge.
The array is provided with an extra row of PEs (called spares) and these spares
do not perform any useful operation during the normal operation. These cells do
only self-testing and remain non-active for other operations. Once a $PE_{i,j}$ ($PE_{i,j}$
denotes the PE whose physical index is $(i, j)$ and $PE^L_{i,j}$ denotes the PE with logical
index $(i, j)$) detects a fault, it marks itself as bad and the reconfiguration is done
as follows:
• if $PE_{i,j}$ is a non-active spare, no shift is done;
• if a working $PE_{i,j}$ fails and the spare cell, $PE_{row,j}$, is available, $PE_{i,j}$
invokes a downward shift;
• else a fatal failure occurs.
For example, if in the array shown in Figure 3.1, $PE_{3,2}$ fails, no shift is performed
but it is marked as a bad PE. But when $PE_{2,1}$ fails, it checks the availability of the
spare cell, $PE_{3,1}$, and since this spare is available, the reconfiguration is done and
a downward shift is performed for all $PE_{x,1}$, where $i \le x \le row - 1$ (here, $i = 2$
and $row = 3$). After this failure, if any PE fails in column 1, the algorithm cannot
tolerate the fault and a fatal failure occurs.
Similarly if a spare, such as $PE_{3,1}$, fails first, then any further failure in column 1
would result in a fatal failure.
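The three rules above can be sketched as a small state model. This is a minimal illustration and not the hardware implementation: the class name `OnlineArray`, its attributes and the use of an exception to signal a fatal failure are assumptions, indices are 1-based as in the text, and the last physical row holds the spares.

```python
class OnlineArray:
    """rows x cols physical array whose last row (index `rows`) contains the spares."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.bad = set()  # physical PEs that have marked themselves as bad
        self.spare_available = {j: True for j in range(1, cols + 1)}
        # physical_row[(i, j)]: physical row currently playing logical cell (i, j)
        self.physical_row = {(i, j): i for i in range(1, rows)
                             for j in range(1, cols + 1)}

    def fail(self, i, j):
        """Handle a fault reported by physical PE (i, j) and return the action taken."""
        self.bad.add((i, j))
        if i == self.rows and self.spare_available[j]:
            self.spare_available[j] = False  # idle spare lost, nothing to move
            return "no shift"
        if not self.spare_available[j]:
            raise RuntimeError(f"fatal failure in column {j}")
        # Downward shift: logical cells at or below the faulty PE move one row down.
        for key, row in list(self.physical_row.items()):
            if key[1] == j and row >= i:
                self.physical_row[key] = row + 1
        self.spare_available[j] = False
        return "downward shift"

array = OnlineArray(rows=3, cols=3)
print(array.fail(3, 2))             # no shift
print(array.fail(2, 1))             # downward shift
print(array.physical_row[(2, 1)])   # 3
```

Any further call such as `array.fail(1, 1)` would raise the fatal-failure exception, because the spare of column 1 is already in use.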
Figure 3.1: Proposed On-Line Reconfiguration Scheme
Figure 3.2: Staging Latch Position in Normal Arrays
3.2 Implementation 
In most systolic arrays, staging latches are provided at the input end of the PE, as
shown in Figure 3.2.
The clock is applied to these latches for propagation of data. When a clock
edge arrives, $PE_i$ latches the data from $PE_{i-1}$ and it is available to $PE_i$ for the
full duration of a clock pulse.
Now, consider the one-dimensional pipeline shown in Figure 3.3. 
During normal operation, each $PE_i$ gets input from the output of the previous
$PE_{i-1}$. Here, inputs and outputs are written as $I_x^t$ and $O_x^t$, meaning that
$I_x^t$ is the input of $PE_x$ at time $t$ and $O_x^t$ is the output of $PE_x$ at time $t$.
Similarly, $I_{x,L}^t$ and $O_{x,L}^t$ denote the input and output of $PE_x^L$ at time $t$
respectively. For the pipeline shown in Figure 3.3, at any time $t$, $I_2^t = O_1^t$,
$I_3^t = O_2^t$ ... and so on. At any time $t_1$ ($t < t_1 < t+1$), each $PE_i$ is processing
the data which was available at its input at time $t$. Since we have the staging latches
at the input end, the failure of $PE_i$ at time $t_1$ makes the data available on link
$L_{i+1}$ (the link between $PE_i$ and $PE_{i+1}$) erroneous. If a spare is available at the
rightmost position of this pipeline, a rightward global shift is performed and the
pipeline would look as in Figure 3.3(b). Now, $PE_{i+1}$ acts as $PE_i^L$ and since the
partial result generated by $PE_i$ is faulty at time $t_1$, it must be recomputed by
$PE_i^L$. For generating this result, $PE_{i+1}$ requires the same input which was
available to $PE_i$ at time $t$, but this data is not available at $t_1$ because at time
$t$ it was generated as $O_{i-1}^t$ by $PE_{i-1}$ and after the clock edge $PE_{i-1}$
receives a new input and changes the output.
Figure 3.3: Pipeline, Before and After Reconfiguration
To overcome this problem, the staging latches are shifted from the input side 
to the output side and the new pipeline is shown in Figure 3.4. 
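The difference between the two latch positions can be illustrated with a toy pipeline model. This is a sketch under assumptions: `stage` is a hypothetical stand-in for the PE logic, and the function names are not from the thesis. With input latches, the operand a PE is working on is held inside that PE; with output latches, it is held on the link driven by the previous (healthy) PE.

```python
def stage(k, x):
    """Hypothetical logic of PE_k; any deterministic function would do."""
    return 10 * x + k


def input_latched(stream, n):
    """Latches at the PE inputs.

    Returns (latches, links): latches[k] is the operand PE_k is processing,
    links[k] is the value PE_k drives towards PE_{k+1} between clock edges."""
    latches = [0] * n
    for x in stream:
        # clock edge: PE_0 latches the external value, PE_k latches stage(k-1, ...)
        latches = [x] + [stage(k, latches[k]) for k in range(n - 1)]
    links = [stage(k, latches[k]) for k in range(n)]
    return latches, links


def output_latched(stream, n):
    """Latches at the PE outputs; latches[k] is held on the link PE_k -> PE_{k+1}."""
    latches = [0] * n
    for x in stream:
        inputs = [x] + latches[:-1]          # PE_k reads the latch of PE_{k-1}
        latches = [stage(k, inputs[k]) for k in range(n)]
    return latches
```

In the input-latched model the link into $PE_1$ no longer carries the operand that $PE_1$ latched, so a replacement PE cannot fetch it; in the output-latched model the operand of $PE_1$ is exactly `latches[0]`, which stays valid while the clock is blocked.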
Figure 3.4: Modified Staging Latch Position
In this case, the links and output ports never carry faulty data, because the moment a fault is detected by any PE, the PE requests the central processor to block the clock. Here, in the case of a $PE_i$ failure at time $t_1$, $O_{i-1}^t$ is available at the output port of $PE_{i-1}$ and it can be used by $PE_i^L$. Once $PE_i$ changes its logical index, it has to use the weight (static coefficient) which was being used by $PE_{i-1}$. This is discussed in the next subsection.
3.2.1 Loading of Weights 
When an array is implemented, it is not possible to connect all the static coefficient 
latches to the external ports (which are used to connect the array and the central
processor) due to extensive link requirements. So usually the input line is used to 
load the static coefficients in the array before the array begins processing data. In 
most systolic arrays, one of the data streams (either vertical or horizontal) passes 
through the array without getting modified and this feature is used to load the 
static coefficients. In the following discussion, it is assumed that the vertical data 
stream is not modified. This can be generalized to the horizontal data stream also. 
A simplified block diagram of a PE is shown in Figure 3.5.
We can use either of the following two methods for loading static coefficients in 
the array. 
Figure 3.5: Block Diagram of PE (with emphasis on Coefficient Loading Circuit)
Method 1 (Sequential Loading) -
Here, the coefficients $w_{i,j}$ are loaded into $PE_{i,j}$ by presenting $w_{i,j}$ on vertical input line $I_j^V$ in the sequence $w_{m-1,j}, w_{m-2,j}, \ldots$ and after $m - 1$ clock pulses ($m$ is the total number of rows in the array; one bottom row of spares is added, making the total number of rows $m + 1$), each $PE_{i,j}$ would have its static coefficient $w_{i,j}$ at its input port. Now the input/coefficient line is made valid for coefficient (informing the PEs that the data available at their vertical input port is the static coefficient) and the clock is applied once. The clock causes the PEs to store the data available at the vertical input port into the static coefficient latch.
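A minimal sketch of the sequential loading of one column follows. It assumes that the vertical stream passes unmodified through one output latch per PE, and the function name `sequential_load` is hypothetical. The list it returns is what each active PE sees at its vertical input port when the input/coefficient line is made valid.

```python
def sequential_load(weights):
    """weights[i] is w_{i,j} for the m active rows of one column (m >= 1)."""
    m = len(weights)
    out_latch = [None] * m              # vertical output latch of each PE
    feed = list(reversed(weights))      # presented as w_{m-1}, w_{m-2}, ..., w_0
    for pulse in range(m - 1):          # m - 1 shifting clock pulses
        port = feed[pulse]              # value on the external vertical input line
        # clock edge: every PE copies its vertical input to its output latch
        out_latch = [port] + out_latch[:-1]
    port = feed[m - 1]                  # w_0 is presented but not shifted in
    at_input = [port] + out_latch[:-1]  # value at the input port of each PE
    # one more clock with input/coefficient valid stores these values
    return at_input
```

For three rows, `sequential_load([5, 7, 9])` leaves 5, 7 and 9 at the input ports of rows 0, 1 and 2.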
Method 2 (Random Loading) -
In this method, some extra hardware is added in the PE and an additional address bus is provided, which carries the address of the PE to which the static coefficient available on the input port ($I_j^V$) belongs (see Figure 3.5).
Figure 3.6: Output Latch Block for Random Coefficient Loading
A multiplexer is used in the output latch block to bypass the output latch (see Figure 3.6) when coefficients are being loaded. In this case, each $PE_{i,j}$ ($0 \le i \le m$; $0 \le j < n$) gets the same data which is available on input port $I_j^V$. Firstly, weight $w_{i,j}$ is put on port $I_j^V$, then the address of $PE_{i,j}$ is put on the address bus and the clock is applied to store $w_{i,j}$ in $PE_{i,j}$. This scheme requires extra hardware and random loading is not essential in most cases, so it is rarely used.
When the array is operational, it is not possible to load the static coefficients without losing some information available in the PEs, because the PEs' output ports carry the partial results. So when a shift is performed in the case of a $PE_{i,j}$ failure, it is not possible to load the new weight $w_{x-1,j}$ in $PE_{x,j}$ ($i < x \le m$) without losing some of the partial results. To overcome this problem, one more static coefficient latch is added in the PEs and the latches are called static coefficient latch '0' and static coefficient latch '1'. Initially $PE_{i,j}$ uses static coefficient latch '0' (carrying $w_{i,j}$) and in the case of a $PE_{i,j}$ failure, the $PE_{x,j}$ ($i < x \le m$) start using static coefficient latch '1' (carrying $w_{x-1,j}$). An additional line, select 0/1, is used to help the proper storing of static coefficients. This avoids the need to reload at the time of reconfiguration. The block diagram of the PEs is shown in Figure 3.7. RR (Reconfiguration Request) is a signal which comes to $PE_{x,j}$ ($i < x \le m$) in the case of a $PE_{i,j}$ failure (it is explained in the next subsection).
Figure 3.7: Block Diagram of PE With Two Static Coefficient Latches
In this case the coefficients are loaded initially using Method 1 (explained earlier). Initially the select 0/1 line is made valid for latch 0, so at clock $m - 2$ (because there are $m$ active rows in the array, namely row 0 through row $(m - 1)$, and clock pulses are counted from pulse 0), $w_{i,j}$ is loaded in $PE_{i,j}$ ($0 \le i < m$) and $w_{i,j}$ ($0 \le i < m$) appears at the input of $PE_{i+1,j}$. At this point, the select 0/1 line is made valid for latch 1 and the next clock pulse, $m - 1$, loads $w_{i-1,j}$ in $PE_{i,j}$ ($0 < i \le m$).
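The two-latch loading can be sketched for one column as follows. This is a simplified model, not the thesis circuit: the function names are hypothetical, the column has $m$ active rows plus one spare, and the exact pulse numbering is an assumption of the sketch (it may differ by one from the counting used in the text).

```python
from typing import List, Optional, Tuple


def load_two_latches(weights: List[int]) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """Return (latch0, latch1) for physical rows 0..m of one column (m >= 1)."""
    m = len(weights)
    out_latch: List[Optional[int]] = [None] * (m + 1)   # row m is the spare
    feed = list(reversed(weights))                      # w_{m-1}, ..., w_0
    for pulse in range(m - 1):                          # shift the stream down
        out_latch = [feed[pulse]] + out_latch[:-1]
    at_input = [feed[m - 1]] + out_latch[:-1]
    latch0 = list(at_input)            # select 0/1 valid for latch '0'
    out_latch = list(at_input)         # the same edge moves every value one row down
    at_input = [None] + out_latch[:-1]
    latch1 = list(at_input)            # select 0/1 valid for latch '1'
    return latch0, latch1


def weight_in_use(latch0, latch1, failed_row: Optional[int], x: int):
    """Coefficient used by physical row x once row `failed_row` has failed."""
    if failed_row is not None and x > failed_row:
        return latch1[x]               # rows below the fault switch to latch '1'
    return latch0[x]
```

After a failure in row 1 of a three-row column, rows 2 and 3 pick up the coefficients of logical rows 1 and 2 from latch '1' without any reload.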
During reconfiguration, rerouting of data is done, so a switching network is added to facilitate the rerouting. For an active array of size $m \times n$, a physical array of size $(m + 1) \times n$ ($PE_{0,0}$ through $PE_{m,n-1}$) is required and, to support the routing, a switch array of size $(m + 2) \times (n + 1)$ ($S_{0,0}$ through $S_{m+1,n}$) is required. The complete array is shown in Figure 3.8. $I_i^H$ and $I_j^V$ represent the horizontal input of row $i$ and the vertical input of column $j$ from the central processor respectively. Similarly, $O_i^H$ and $O_j^V$ represent the horizontal output of row $i$ and the vertical output of column $j$ from the array respectively.
Each switch module shown in Figure 3.8 is a pair of switches (one is used for 
vertical routing and the other for horizontal routing). For the sake of clarity the 
vertical and horizontal paths are shown separately in Figure 3.9. 
In the next subsection a scheme is proposed for proper handling of partial results 
in the case of P E failure. 
3.2.2 Handling of Partial Results
Consider the array shown in Figure 3.8. When a $PE_{i,j}$ fails, it invokes a downward shift (if the bottom row spare is available) and the logical index of $PE_{x,j}$ ($i < x \le m$) changes from $(x,j)$ to $(x - 1,j)$. For the sake of clarity, vertical and horizontal partial result handling are explained separately.
Handling of Vertical Partial Result 
At any time $t_1$ ($t < t_1 < t + 1$; shown in Figure 3.10), the PEs are processing the data which are available at their input ports at time $t_1$, because the data were latched by the output latches of the previous cells at time $t$ and they remain there till the next clock edge, $t + 1$, comes. $I_{i,j}^{H,t}$ and $I_{i,j}^{V,t}$ denote the horizontal and vertical inputs of $PE_{i,j}$ at time $t$ respectively, and $O_{i,j}^{H,t}$ and $O_{i,j}^{V,t}$ represent the horizontal and vertical outputs of $PE_{i,j}$ at time $t$ respectively. Similarly, $I_{i,j,L}^{H,t}$, $I_{i,j,L}^{V,t}$, $O_{i,j,L}^{H,t}$ and $O_{i,j,L}^{V,t}$ denote the horizontal input, vertical input, horizontal output and vertical output of $PE_{i,j}^L$ (the PE with logical index $(i,j)$) at time $t$ respectively. When $PE_{i,j}$ fails at time $t_1$, it immediately generates a Reconfiguration Request (RR) and passes it to the central processor, which delays the next clock edge, $t + 1$, for a pre-specified duration (which depends on the time taken for the switch settings and the processing time of each PE). RRs are written as $RR_X^Y$, which means that the RR is generated by $X$ and it is fed to $Y$ (for example, $RR_{PE_{i,j}}^{S_{i,j+1}}$ denotes the reconfiguration request generated by $PE_{i,j}$ and it goes to switch $S_{i,j+1}$). Since the logical index of $PE_{x,j}$ ($i < x \le m$) has changed from $(x,j)$ to $(x - 1,j)$ at $t_1$, $PE_{x,j}$ ($i < x \le m$) has to process the same data which $PE_{x-1,j}$ was processing at time $t_1$; for instance, after $t_1$, $PE_{i+1,j}$ should get $O_{i-1,j}^{V,t}$, $PE_{i+2,j}$ should get $O_{i,j}^{V,t}$ and so on, meaning that $I_{i+1,j}^{V,t} = O_{i-1,j}^{V,t}$, $I_{i+2,j}^{V,t} = O_{i,j}^{V,t}$ and so on. To accomplish this, an intermediate state of the vertical path is provided (shown in Figure 3.10), which is called the first or intermediate stage of rerouting. At $t_1$, the switches $S_{x,j+1}^V$ ($i \le x \le m + 1$) are set to provide this routing and the next clock is applied at $t + 1$, which causes the intermediate results to appear on the output ports of the PEs. At this time the switches $S_{x,j+1}$ ($i < x \le m + 1$) are set again to get the final reconfigured vertical path (shown in Figure 3.10). After final routing, $I_{i,j,L}^{V,t} = O_{i-1,j,L}^{V,t}$ ($I_{i+1,j}^{V,t} = O_{i-1,j}^{V,t}$), and so on.
Figure 3.8: Basic Array with Switch Modules
Figure 3.9: Vertical and Horizontal Data Paths
The horizontal partial result handling is explained in the next subsection.
Handling of Horizontal Partial Result 
After the $PE_{i,j}$ failure at $t_1$, each $PE_{x,j}$ ($i < x \le m$) has to work as $PE_{x-1,j}^L$ and each $PE_{x,j}$ ($i < x \le m$) has to get the same horizontal input as $PE_{x-1,j}$ was getting at time $t_1$, i.e., $I_{i+1,j}^{H,t_1} = I_{i,j}^{H,t}$, $I_{i+2,j}^{H,t_1} = I_{i+1,j}^{H,t}$ and so on. To accomplish this, an intermediate horizontal routing is done at $t_1$ (as shown in Figure 3.11) and at $t + 1$ the final routing is done to get the final reconfiguration, so that each $PE_{x,j+1}$ ($i \le x < m$) gets its horizontal partial result from $PE_{x+1,j}$.
Lemma 3.1 - Reconfiguration in the case of a PE failure requires a maximum of two stages of rerouting.
Proof - There are only two combinations of PE failure: either a spare PE fails or an active PE fails.
1. When a spare PE fails, it does not invoke any reconfiguration and
2. when an active PE fails, it invokes the reconfiguration and, as explained earlier (in the vertical and horizontal partial result handling subsections), any such failure requires two stages of rerouting (intermediate stage and final stage).
□
Figure 3.10: Vertical Partial Result Handling (normal working, first stage of reconfiguration, final reconfiguration; normal clock and clock after failure)
Figure 3.11: Horizontal Partial Result Handling (legend: normal routing, first stage routing, final change)
In the next subsection switch modules are discussed. 
3.2.3 Switch Module 
As explained earlier, each switch module consists of two switches. One of them is 
used exclusively for horizontal data routing and the other is used for vertical data 
routing. Both of them are discussed separately in the following subsections. 
Vertical Data Routing Switch 
Consider the array shown in Figure 3.12 (only vertical data paths are shown). Here, $I_0^V, I_1^V, I_2^V, \ldots$ are the input data from the central processor to the array and $O_0^V, O_1^V, O_2^V, \ldots$ form the final output from the array.
At time $t_1$, $PE_{i+1,j-2}$ has already failed (and has been reconfigured) and $PE_{i,j}$ fails at time $t_1$, causing the first stage of rerouting to be done. So in this figure, column $(j - 1)$ of the switches shows the vertical data path which is fully reconfigured and column $(j + 1)$ of the switches shows the vertical data path in the intermediate stage. To support the reconfiguration, the network shown in Figure 3.13 is provided.
It is clear from the network that the switch modules $S_{x,0}$ ($0 \le x \le m + 1$) need not have the vertical data routing switch. Each switch is a $2 \times 2$ switch; the inputs are denoted as $I_{S0}^V$, $I_{S1}^V$ and the outputs are written as $O_{S0}^V$ and $O_{S1}^V$. The vertical input of the array, $I_x^V$, is given to the $I_{S0}^V$ input of $S_{0,x+1}^V$ ($0 \le x < n$) and the $I_{S1}^V$ of these switches is not used. The final vertical output, $O_x^V$, is taken from the array by $O_{S0}^V$ of $S_{m+1,x+1}^V$. In order to get all the required connections, the vertical switches have two states (shown in Figure 3.14).
Figure 3.12: Vertical Data Routing Path (normal clock and clock after failure; legend: active PE, spare PE, faulty PE)
Figure 3.13: Network for Vertical Data Handling during Reconfiguration
Figure 3.14: States of Vertical Switches (For PE failure algorithm)
Initially, all the switches $S_{i,j}^V$ ($0 \le i \le m - 1$ and $0 \le j \le n$) are in state $ST_0^V$ and the switches $S_{i,j}^V$ ($m \le i \le m + 1$ and $0 \le j \le n$) are in state $ST_1^V$. When a $PE_{i,j}$ fails at $t_1$, it changes the states of switches $S_{x,j+1}^V$ ($i \le x \le m - 1$) from $ST_0^V$ to $ST_1^V$. At $t + 1$, switches $S_{x,j+1}^V$ ($i + 2 \le x \le m + 1$) are brought back to state $ST_0^V$.
Lemma 3.2 - The two proposed states ($ST_0^V$ and $ST_1^V$) of vertical switches are sufficient to support the algorithm.
Proof - As shown in Lemma 3.1, a PE failure requires two stages of rerouting, so a vertical data path can be in any one of the following three states:
1. the particular data path doesn't have any faulty PE;
2. the particular data path has a faulty PE and the reconfiguration is in the intermediate stage or
3. the particular data path has a reconfigured faulty PE.
The data paths required by these states are shown in Figure 3.12. Since $PE_{i+1,j-2}$ has been reconfigured completely, column $(j - 1)$ of the switches shows the data paths required by the final stage. The $PE_{i,j}$ failure has gone through the first stage of rerouting only, so column $(j + 1)$ of switches shows the data paths required by the intermediate stage of rerouting. Other columns of switches show the normal data routing. It is clear from Figure 3.12, Figure 3.13 and Figure 3.14 that the proposed two states of the switches provide all the required data paths. For a column $x$ of PEs, if no $PE_{i_1,x}$ ($0 \le i_1 \le m - 1$) is faulty, the spare PE, $PE_{m,x}$, is bypassed by bringing switches $S_{m,x+1}^V$ and $S_{m+1,x+1}^V$ to $ST_1^V$. Other switches of column $(x + 1)$ would be in $ST_0^V$. The $PE_{i+1,j-2}$ failure is reconfigured completely and the data paths required for this are provided by bringing $S_{i+1,j-1}^V$ and $S_{i+2,j-1}^V$ to $ST_1^V$. Other switches of column $(j - 1)$ stay in $ST_0^V$. The $PE_{i,j}$ failure is in the intermediate stage and data paths are provided by bringing switches $S_{i_1,j+1}^V$ ($i \le i_1 \le m + 1$) to $ST_1^V$. Other switches of column $(j + 1)$ stay in $ST_0^V$. □
Horizontal Data Routing Switch 
The horizontal routing is shown in Figure 3.15. At $t_1$, $PE_{i,j-2}$ has already failed and has been reconfigured completely and $PE_{i,j}$ fails at this point, causing the first stage of reconfiguration. The network illustrated in Figure 3.16 is provided to support the algorithm. It is clear that the switches $S_{0,j}$ ($0 \le j < n$) need not have the horizontal switch. The horizontal input to the array, $I_i^H$, comes to the $I_{S0}^H$ port of switch $S_{i+1,0}^H$ for all $i$ ($0 \le i < m$) and the output $O_i^H$ is taken from the $O_{S0}^H$ port of switch $S_{i+1,n}^H$. The various switch states for a switch $S_{i,j}^H$ are shown in Figure 3.17.
When a $PE_{i,j}$ fails at $t_1$, it changes the states of $S_{x,j}^H$ ($i < x < m + 1$) from $ST_0^H$ to $ST_1^H$ and at time $t + 1$ the next clock edge is given, which changes the switches $S_{x,j+1}^H$ ($i < x \le m + 1$) from $ST_0^H$ to $ST_1^H$. At the time of switch settings, the PEs are informed to use the proper input port, on which the correct data is available.
The above scheme is valid when, for a $PE_{i,j}$ failure, there is no faulty $PE_{i_1,j-1}$ or $PE_{i_1,j+1}$ present. In the presence of any such faulty PE, the algorithm is changed. Both of these cases are explained below:
Figure 3.15: Horizontal Data Routing Path (reconfigured column and column after first fault rerouting; normal clock and clock after failure; legend: active PE, spare PE, faulty PE)
Figure 3.16: Network for Horizontal Data Handling during Reconfiguration
Figure 3.17: States of Horizontal Switches (For PE failure algorithm)
a. $PE_{i_1,j-1}$ Faulty - the array is shown in Figure 3.18. After the intermediate rerouting at $t_1$, $PE_{x,j}^L$ ($i \le x \le m$; which is $PE_{x+1,j}$) should get data from $PE_{x,j-1}^L$ (which is $PE_{x+1,j-1}$), so the switches $S_{x,j}^H$ ($i < x \le m + 1$) change state either from $ST_1^H$ (caused by the previous reconfiguration due to the $PE_{i_1,j-1}$ failure) to $ST_0^H$ or from $ST_0^H$ to $ST_1^H$. The final reconfiguration at $t + 1$ changes the states of the switches $S_{x,j+1}^H$ ($i < x \le m + 1$) from $ST_0^H$ to $ST_1^H$.
b. $PE_{i_1,j+1}$ Faulty - the array is shown in Figure 3.19. Here, at $t_1$ all the switches $S_{x,j}^H$ ($i < x \le m + 1$) change state from $ST_0^H$ to $ST_1^H$. After the rerouting at $t + 1$, $PE_{x,j+1}^L$ ($i \le x \le m$) should get data from $PE_{x,j}^L$. To achieve this, at $t + 1$ the switches $S_{x,j+1}^H$ ($i < x \le m + 1$) change state either from $ST_0^H$ to $ST_1^H$ or from $ST_1^H$ to $ST_0^H$.
Lemma 3.3 - The two proposed states ($ST_0^H$ and $ST_1^H$) of the horizontal switches are sufficient to support the algorithm.
Proof - Horizontal data routing, in the case of a $PE_{i,j}$ failure, depends on earlier failures. There are only four combinations of this occurrence, which are listed below:
1. no PE in columns $(j - 1)$ and $(j + 1)$ is faulty;
2. column $(j - 1)$ has a faulty PE ($PE_{i_1,j-1}$) and it has been reconfigured (it is assumed that faults occur one at a time);
3. column $(j + 1)$ has a faulty PE ($PE_{i_1,j+1}$) and it has been reconfigured and
4. both columns $(j - 1)$ and $(j + 1)$ have faulty cells ($PE_{i_1,j-1}$ and $PE_{i_2,j+1}$ respectively).
As shown in Lemma 3.1, only two stages of rerouting are required in the case of a PE failure reconfiguration. For horizontal data rerouting, in the case of a $PE_{i,j}$ failure, the intermediate stage requires modification of the data links between the PEs of column $(j - 1)$ and the PEs of column $j$, and the final stage of rerouting requires modification of the data links between the PEs of column $j$ and the PEs of column $(j + 1)$. Cases 1, 2 and 3 are shown in Figures 3.15, 3.18 and 3.19 respectively and it is clear that the proposed two states of horizontal switches are capable of providing all required data links. Case 4 is the combination of Case 2 and Case 3 and since, for horizontal data rerouting, the intermediate and final stages of rerouting are mutually exclusive (the intermediate stage requires state changes of switches in column $j$ and the final stage requires state changes of switches in column $(j + 1)$), the intermediate rerouting in this case would be similar to that of Case 2 and the final rerouting would be similar to that of Case 3. So the proposed two states ($ST_0^H$ and $ST_1^H$) would provide all horizontal data paths required by the algorithm. □
Figure 3.18: Horizontal reconfiguration for $PE_{i,j}$ in presence of faulty $PE_{i_1,j-1}$ (normal clock and clock after failure)
Figure 3.19: Horizontal reconfiguration for $PE_{i,j}$ in presence of faulty $PE_{i_1,j+1}$ (legend: links before $t_1$, links after first reconfiguration, links changed during second rerouting)
So, in the case of a $PE_{i,j}$ failure at $t_1$, the switches $S_{x,j}^H$ ($i < x \le m + 1$) change state either from $ST_0^H$ to $ST_1^H$ or from $ST_1^H$ to $ST_0^H$ at $t_1$, and at $t + 1$ the switches $S_{x,j+1}^H$ do the same.
The switches are finite state blocks. In the next subsection various changes in 
the basic network, processing element and switch modules are explained. 
3.2.4 Network 
The network is modified to implement the algorithm and it is shown in Figure 3.20. In this figure, the global clock line ($CLK_{PE}$), the input/coefficient line, the select 0/1 line (used for loading the coefficients initially), the reset line (used for resetting all the flip-flops initially) and the fatal failure line (explained later) are not shown. Various control lines for a PE and switch module are shown in Figure 3.21.
$RR_{PE_{i,j}}^{PE_{i+1,j}}$ is the reconfiguration request from $PE_{i,j}$ to $PE_{i+1,j}$ and $RR_{PE_{i,j}}^{S_{i_1,j_1}}$ is the reconfiguration request from $PE_{i,j}$ to switch $S_{i_1,j_1}$. FF is connected to the fatal failure line, which indicates the occurrence of a fatal failure. Once a PE fails, the reconfiguration starts and it is done based on the information available on these lines. When a faulty $PE_{i,j}$ receives an RR from $PE_{i-1,j}$, it generates FF (fatal failure signal) and puts it on the FF line, which carries it to the central processor.
3.2.5 Processing Element 
The block diagram of the processing element is shown in Figure 3.22. Each $PE_{i,j}$ has two static coefficient latches and two horizontal inputs ($I_{PE0}^H$ and $I_{PE1}^H$) and the selection is done by using the select 0/1 line, which becomes high when $PE_{i,j}$ receives $RR_{PE_{i-1,j}}^{PE_{i,j}}$.
The SPE (spare PE) signal is applied to the spare cells initially and it is latched to derive $SPE_L$, which is used to ensure that the spare cells do not invoke reconfiguration. The PE test circuit checks the state of the PE and when it detects a fault in the logic circuit, it generates $E_{LOGIC}$, which remains high for the full duration of the array operation. The block diagram of the control circuit is given in Figure 3.23 and the timing diagram of various signals is shown in Figure 3.24.
Figure 3.20: Modified Network (for supporting PE failure algorithm) (legend: vertical data, horizontal data, control lines; RRs to central processor)
Figure 3.21: Control Lines for PE and Switches (PE failure algorithm)
Figure 3.22: Complete Block Diagram of Modified PE (PE failure algorithm)
Once a fault is detected by the self-test circuit of $PE_{i,j}$, it passes this information on to the control block of $PE_{i,j}$ using the line $E_{LOGIC}$. When the control circuit of $PE_{i,j}$ receives this $E_{LOGIC}$ (see Figure 3.24.a), it generates $RR_{PE_{i,j}}^{PE_{i+1,j}}$, $RR_{PE_{i,j}}^{S_{i,j+1}}$ and $RR_{PE_{i,j}}^{S_{i+1,j}}$. These RRs are reset at $t + 1$, but $E_{LOGIC}$ stays high. Now, if $PE_{i,j}$ receives $RR_{PE_{i-1,j}}^{PE_{i,j}}$, it would generate the FF signal (because this indicates two faulty PEs in the same column).
When $PE_{i,j}$ receives $RR_{PE_{i-1,j}}^{PE_{i,j}}$ at $t_1$ (see Figure 3.24.b), it generates $RR_{PE_{i,j}}^{PE_{i+1,j}}$, $RR_{PE_{i,j}}^{S_{i,j+1}}$ and $RR_{PE_{i,j}}^{S_{i+1,j}}$ at $t_1$ and at the next clock edge, $t + 1$, it resets these RRs and generates $RR_{PE_{i,j}}^{S_{i+1,j+1}}$, which is reset at the next falling edge of the clock, at $t_2$. In this case, if $PE_{i,j}$ fails at $t_3$, it generates FF.
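The two conditions that raise FF can be sketched as a single function over one column. This is an illustrative model only: `handle_fault` and its parameters are hypothetical names, and the sets stand in for the error flag and the select 0/1 state held inside each PE.

```python
from typing import List, Set, Tuple


def handle_fault(i: int, m: int, faulty: Set[int], shifted: Set[int]) -> Tuple[List[int], bool]:
    """Model one new fault in row i of a column whose spare is row m.

    faulty  : rows whose error flag is already high
    shifted : rows that have already received an RR (select 0/1 is high)
    Returns (rows reached by the RR, whether FF is raised)."""
    if i in shifted:
        return [], True                 # a PE that already received an RR now fails
    reached = []
    for x in range(i + 1, m + 1):       # RR travels from row i+1 down to the spare
        reached.append(x)
        if x in faulty:
            return reached, True        # a faulty PE receives an RR
    return reached, False
```

For a fresh column with `m = 3`, a fault in row 1 reaches rows 2 and 3 without raising FF; if the spare had failed earlier, the same request raises FF when it arrives at row 3.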
3.2.6 Switch 
The block diagram of the switch module is given in Figure 3.26. Each switch consists of three basic circuits: one control circuit, one horizontal switch (used for horizontal data routing) and one vertical switch (used for vertical data routing). The control circuit is very simple in this case and the horizontal switch toggles from one state to the other when either $RR_{PE_{i-1,j}}^{S_{i,j}}$ or $RR_{PE_{i-1,j-1}}^{S_{i,j}}$ comes. The vertical switch goes to state $ST_1^V$ when $RR_{PE_{i,j-1}}^{S_{i,j}}$ comes and it goes back to $ST_0^V$ at the arrival of $RR_{PE_{i-1,j-1}}^{S_{i,j}}$.
Consider a portion of the array as shown in Figure 3.25. When $PE_{i,j}$ fails at $t_1$, various RRs are generated. The control circuit (shown in Figure 3.23) is used to generate these signals. The $RR_{PE_{m,j}}^{PE_{m+1,j}}$ of the bottom row of cells is connected to the central processor, which delays the next rising edge ($t + 1$) of the clock. This delay is the sum of the switch settling time and the processing time of a PE. The arrival of pulse $t + 2$ is also delayed by the same amount of time and after that the clock resumes its normal speed.
The central processor gives a signal called SPE (spare PE) to the spare PEs and it is used to bring $S_{m,j}^V$ and $S_{m+1,j}^V$ to $ST_1^V$ initially. SPE is latched as $SPE_L$, which is used to ensure that no RR is generated when a spare cell detects a self-fault. The $RR^{S_{m+1,j}}$ input of the switches $S_{m+1,j}$ ($0 \le j < n$) is pulled low. $SPE_L$ and this pulled-low RR input ensure that $S_{m+1,j}^H$ ($0 \le j < n$) do not change state (these switches always remain in $ST_0^H$).
Figure 3.23: Control Circuit of PE (PE failure algorithm)
Figure 3.24: Signal Waveforms (Output of the PE's control circuit)
Figure 3.25: Reconfiguration Request Propagation (for PE failure algorithm) (legend: vertical data, horizontal data, control lines; RRs to central processor)
The switches are finite state blocks as shown in Figure 3.26 and the states of the switches, depending on the RR lines, are shown in Figure 3.27. Case A shows the vertical switch state change for switches $S_{i,j+1}^V$ and $S_{i+1,j+1}^V$ in the case of a $PE_{i,j}$ failure at $t_1$. At $t_1$, these switches go to $ST_1^V$ and stay there. Case B shows the vertical switch state changes for switches $S_{i_x,j+1}^V$ ($i + 1 < i_x < m$) in the case of a $PE_{i,j}$ failure at $t_1$. At $t_1$, these switches go to $ST_1^V$ and come back to $ST_0^V$ at $t + 1$. Case C shows the state transition of switches $S_{i_x,j}^H$ ($i < i_x \le m$). These switches toggle from one state to the other at $t_1$ and remain in this state. Case D shows the state transition of switches $S_{i_x,j+1}^H$ ($i < i_x \le m$). These switches toggle from one state to the other at $t + 1$ and remain in that state.
In the next section, the full algorithm is detailed. 
Figure 3.26: Block Diagram of the Switch module (for PE failure algorithm)
3.3 Operation of the Algorithm 
Consider an $m \times n$ array (with $m$ active rows of PEs, 0 through $m - 1$, and $n$ active columns of PEs, 0 through $n - 1$).
Initially, all the horizontal switches $S_{i,j}^H$ and vertical switches $S_{i,j}^V$ ($0 \le i \le m + 1$; $0 \le j \le n$) are brought to state $ST_0$ ($ST_0^H$ and $ST_0^V$ respectively) by applying a pulse at the global reset line. Then the static coefficients are loaded in the array by using the vertical input and input/coefficient lines as explained in subsection 3.2.1 ($PE_{i,j}$ ($0 \le i < m$; $0 \le j < n$) contains the static coefficient of $PE_{i,j}^L$ in accumulator '0' and of $PE_{i-1,j}^L$ in accumulator '1'). Spare PEs, $PE_{m,j}$ ($0 \le j < n$), do not have any valid data in accumulator '0' and accumulator '1' contains the static coefficient of $PE_{m-1,j}^L$.
Next, the vertical switches $S_{i,j}^V$ ($m \le i \le m + 1$; $1 \le j \le n$) are brought to state $ST_1^V$ by giving a pulse to the SPE inputs of the spare cells. SPE gets latched as $SPE_L$. This prepares the array for operation.
When a $PE_{i,j}$ ($i \ne m$) fails at $t_1$, it issues RRs to $S_{i,j+1}$, $S_{i+1,j+1}$, $S_{i+1,j}$ and $PE_{i+1,j}$. After receiving this request, $PE_{i+1,j}$ generates RRs to switches and to $PE_{i+2,j}$. In this way the reconfiguration request goes from $PE_{i,j}$ to the spare, $PE_{m,j}$. If on its way it encounters a faulty cell, a fatal failure occurs and the FF signal is given to the central processor. When a switch $S_{i,j}$ receives $RR_{PE_{i-1,j-1}}^{S_{i,j}}$, it generates $RR_{S_{i,j}}^{S_{i-1,j}}$ to $S_{i-1,j}$, which is used to decode the relative location of $S_{i-1,j}$ with respect to the failed PE.
Figure 3.27: State Transition of The Switches (for PE failure algorithm) (cases A and B: vertical switch; cases C and D: horizontal switch; $CLK_{PE}$ with and without failure)
Once the reconfiguration request reaches $PE_{m,j}$, $PE_{m,j}$ generates $RR_{PE_{m,j}}^{PE_{m+1,j}}$, which is given to the central processor. The central processor delays the next $CLK_{PE}$ edge, $t + 1$. Each $PE_{x,j}$ ($i < x \le m$) generates the RRs to the switches and the switches are reconfigured in two stages:
1. At $t_1$, the vertical switches $S_{x,j+1}^V$ ($i \le x \le m - 1$) are brought to state $ST_1^V$, $S_{m,j+1}^V$ and $S_{m+1,j+1}^V$ remain in $ST_1^V$ and the horizontal switches $S_{x,j}^H$ ($i < x < m + 1$) toggle either from state $ST_0^H$ to $ST_1^H$ or from $ST_1^H$ to $ST_0^H$. At the same time, $PE_{x,j}$ ($i < x \le m$) start using static coefficient latch '1' and select the horizontal input port $I_{PE1}^H$ for use.
2. At $t + 1$, the vertical switches $S_{x,j+1}^V$ ($i + 1 < x \le m + 1$) change state from $ST_1^V$ to $ST_0^V$ and the horizontal switches $S_{x,j+1}^H$ ($i < x \le m + 1$) toggle either from state $ST_0^H$ to $ST_1^H$ or from $ST_1^H$ to $ST_0^H$. This completes the reconfiguration and then the normal clock speed is resumed.
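The two-stage setting of the vertical switches in one switch column can be sketched as below. This is a minimal model assuming the row ranges stated for the vertical switches; the names `initial_vertical` and `vertical_after_failure` are hypothetical, and the horizontal switches (which toggle in column $j$ at $t_1$ and in column $j + 1$ at $t + 1$) are left out.

```python
ST0, ST1 = 0, 1


def initial_vertical(m):
    """Vertical switches of rows 0..m+1 in one switch column before any failure.

    Rows m and m+1 are in ST1, which bypasses the spare PE."""
    return [ST0] * m + [ST1, ST1]


def vertical_after_failure(m, i):
    """States after stage 1 (at t1) and stage 2 (at t+1) for a fault in row i."""
    state = initial_vertical(m)
    for x in range(i, m):              # stage 1: rows i..m-1 go to ST1
        state[x] = ST1
    intermediate = list(state)
    for x in range(i + 2, m + 2):      # stage 2: rows i+2..m+1 return to ST0
        state[x] = ST0
    return intermediate, state
```

In this model the final state always has exactly two switches in $ST_1^V$, those of rows $i$ and $i + 1$, which is the pair that bypasses the failed PE in the same way the spare was bypassed before the fault.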
3.4 Concluding Remarks 
In this chapter an on-line reconfiguration algorithm for PE failures was discussed. Here an extra row of cells (called spares) is provided to the array and, in the case of a detected PE failure, a global shift is performed for the corresponding column.
The staging latches were shifted from the input side to the output side to fa. 
cilitate the full use of non· faulty partial results. The PEs are provided with an 
additional static coefficient latch to avoid reloading of static coefficients in the case 
of a P E failure. The testing circuit and control circuit are added in the PEs to 
detect the fault and generate the reconfiguration requests. In addition, the control
circuit selects the proper input data ports after the rerouting.
The network is modified to support the algorithm and the switches are designed as
finite state machines. It was proved that the reconfiguration requires a maximum
of two stages of rerouting and that the proposed two states of the vertical and horizontal
switches provide the required data paths.
It is assumed that the control circuit of the PEs never fails; this is essential to
ensure the proper operation of the algorithm. Failure of the control circuit may lead
to an unsafe fatal failure. To support this assumption, the control circuit can be provided
with active redundancy. The assumption of sequential failures (faults occurring one
at a time) is made to simplify the modelling of the algorithm. The algorithm can
tolerate simultaneous multiple failures if the failures are not in adjacent columns.
It is assumed that the occurrence of a failure is reported to the central processor
before the arrival of the next clock edge. This can be ensured by making the clock period
slightly longer. The time between the occurrence of a failure and fault reporting
depends upon the number of rows in the array. Therefore, for an array with a small
number of rows, the speed reduction due to the extended clock will be very small.
The above algorithm is modified to accommodate link failures as well, and the
modified algorithm is discussed in the next chapter.
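The condition on simultaneous failures stated above can be checked with a short sketch. The function name and the list-of-(row, column) input format are hypothetical. Treating two failures in the same column as not tolerable is our reading of the single spare per column, not a statement quoted from this chapter.

```python
from typing import Iterable, Tuple


def simultaneous_failures_tolerable(failed: Iterable[Tuple[int, int]]) -> bool:
    """True if no two failed PEs share a column or sit in adjacent columns."""
    columns = sorted(j for _, j in failed)
    # Consecutive sorted columns must differ by at least 2:
    # a difference of 0 means the same column, 1 means adjacent columns.
    return all(b - a >= 2 for a, b in zip(columns, columns[1:]))
```

With this check, failures at (0, 0) and (2, 2) are accepted, while failures at (0, 0) and (1, 1) are rejected because columns 0 and 1 are adjacent.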
Chapter 4 
ALGORITHM FOR PE AND 
LINK FAILURE TOLERANCE 
The basic principle of this algorithm is the same as explained earlier: a bottom row
of spares is provided to the array of size (m x n) and if $PE_{i,j}$ fails, $PE_{i,j}$ is replaced
by $PE_{m,j}$ if $PE_{m,j}$ is available.
A failure of an input link of $PE_{i,j}$ is detected by $PE_{i,j}$ by using parity bit checks.
To tolerate link failures, each link is duplicated.
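A parity check of the kind mentioned above might look like the sketch below. The thesis does not state the parity convention or the word width, so even parity over an integer data word is an assumption here, and the function names are illustrative only.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") % 2


def link_data_ok(word: int, received_parity: int) -> bool:
    """Compare the recomputed parity with the parity bit received over the link."""
    return parity_bit(word) == received_parity
```

A single flipped data bit changes the recomputed parity, so `link_data_ok` returns `False` and the receiving PE would treat the event as a link failure. An even number of flipped bits goes undetected, which is a known limit of a single parity bit.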
Here, the following assumptions are made: 
Assumptions: 
• the faults are occurring one at a time; 
• the link failures are detected by PEs (here even an intermittent data error is
taken as a link failure);
• switches perform self-test only for the control circuit (they do not test the
actual switching circuitry because any fault in a switching circuit results in a
data error, which is detected by PEs);
• once a P E fails, it detects the self-fault; 
• the control circuitry of a P E never fails; 
• the self-testing blocks of PEs and switches never fail; 
• a central processor provides input and clock to the array and it receives output
and fault occurrence signals from the array;
• the occurrence of a failure is reported to the central processor before the
arrival of the next rising clock edge and
• the central processor provides clock pulses (CLKs) to switches also, if any 
PE fails. 
4.1 Data Routing 
As explained earlier, the algorithm (for PE failures) needs one vertical and two 
horizontal links between PEs; consequently now two vertical and four horizontal 
links are provided (link redundancy). The vertical and horizontal data routings are
discussed separately in the following subsections.
4.1.1 Vertical Data Routing Path (for PE and Link failures)
The network for vertical data is shown in Figure 4.1. 
The input links from the central processor and output links to the central processor
are also duplicated. Each PE has two vertical inputs ($I^V_{PE0}$ and $I^V_{PE1}$)
and two vertical outputs ($O^V_{PE0}$ and $O^V_{PE1}$) as shown in Figure 4.1. Similarly,
each vertical switch is a 3 x 3 switch (with three inputs, $I^V_0$, $I^V_1$, $I^V_2$, and three
outputs, $O^V_0$, $O^V_1$, $O^V_2$). To support the algorithm, a total of eight states of the
vertical switches are provided as shown in Figure 4.2. Initially, all switches $S^V_{i,j}$
($0 \le i \le m-1$; $0 \le j \le n$) are in state $ST^V_0$, switches $S^V_{m,j}$ ($1 \le j \le n$) are in $ST^V_1$
and switches $S^V_{m+1,j}$ ($1 \le j \le n$) are in state $ST^V_2$ (see Figure 4.1). The switches
$S^V_{m,0}$ and $S^V_{m+1,0}$ are in $ST^V_0$. The vertical input to the array is applied through the
$I^V_0$ and $I^V_2$ ports of the switches $S^V_{0,j}$ ($0 \le j \le n$). The output $O_j$ appears at $O^V_0$
of $S^V_{m+1,j+1}$ ($0 \le j < n$) in the case of no output link failure.
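The initial switch configuration can be written down as a small table. The sketch below assumes the following reading of the paragraph above, parts of which are hard to read in the source: rows 0 to m-1 start in state 0, rows m and m+1 start in states 1 and 2 for columns 1 to n, and column 0 stays in state 0. States are plain integers 0 to 7 and the function name is hypothetical.

```python
from typing import List


def initial_vertical_states(m: int, n: int) -> List[List[int]]:
    """Return states[i][j] for switch rows 0..m+1 and columns 0..n."""
    states = [[0] * (n + 1) for _ in range(m + 2)]   # everything starts in state 0
    for j in range(1, n + 1):
        states[m][j] = 1        # start of the bypass around the spare PE(m, j-1)
        states[m + 1][j] = 2    # end of the bypass
    return states
```

For a 3 x 2 array this gives five switch rows and three switch columns, with only the last two rows of columns 1 and 2 away from state 0.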
Figure 4.1: Vertical Data Path (for PE and Link failures), showing PE inputs/outputs and switch inputs/outputs.
Figure 4.2: States of the Vertical Switches (for combined PE and Link failures): $ST^V_0$ to $ST^V_7$.
Figure 4.3: Horizontal Data Path (Combined PE and Link failure)
4.1.2 Horizontal Data Routing Path (for PE and Link failures)
The network is shown in Figure 4.3. The horizontal inputs and outputs of the PEs
and switches are also shown in Figure 4.3. Each PE has two pairs of horizontal
inputs ($I^H_{PE0}$, $I^H_{PE1}$ and $I^H_{PE2}$, $I^H_{PE3}$). To support the algorithm, a total of two states
of the horizontal switches are provided as shown in Figure 4.4. Initially all the
horizontal switches are in state $ST^H_0$.
Figure 4.4: Horizontal Switch States (Combined PE and Link failure): $ST^H_0$ and $ST^H_1$.
Figure 4.5: Switch State Changes (Combined PE and Link failure)
In the next section, handling of link failure is explained.
4.2 Handling of a Link failure
Normally, $PE_{i,j}$ processes the data available at its $I^V_{PE0}$ and $I^H_{PE0}$ ports and when
it detects a fault in the data, the PE selects port $I^H_{PE1}$ for horizontal input (in
the case of a horizontal data fault). For a vertical data fault, it checks the switch
srJ+l and if siJ+I is in state S1't' it selects the vertical input port J~Bt· For an 
output Oj fault (as stated earlier, the central processor detects this fauiL), switch 
S~+l.itl is checked and since it is in state STt, it invokes a reconfiguration of 
vertical switches because here, it cannot use the data available at /~El · In this 
" 
ca:;c, S,~+t?,Jtt and S~.i+l change the states of S~+t.i and S~.i from ST: & STt 
to ~n:,V and from STJ' & STt to S1j respectivc•ly. These switches .5'~+tJ and 
S,~,J change the states of S~+l,j-l and S~J-l again and so on, until S~+L!I finds a 
switch 8~+1 ,11_ 1 in state ST;' or in ST[ (here the algorithm assumes that though 
t.he link L~t::!_, , r-' is faulty, the switches S~,11 , S~+l .11 and link £;::1' 11 may not be 
faulty). 
If in the array shown in Figure 4.5, $O_2$ fails, the algorithm changes the states
of $S^V_{m+1,2}$, $S^V_{m+1,1}$ and $S^V_{m+1,0}$ to $ST^V_4$ and switches $S^V_{m,2}$, $S^V_{m,1}$ and $S^V_{m,0}$ are brought
to state $ST^V_3$, and outputs $O_j$ ($0 \le j \le 2$) are taken through the second output
port.
Next, when $O_4$ becomes faulty, it changes the states of $S^V_{m+1,j}$ ($2 < j \le 4$) to
$ST^V_4$ and $S^V_{m,j}$ ($2 < j \le 4$) to $ST^V_3$, and outputs $O_j$ ($2 < j \le 4$) are taken through
the second port.
The PE failure and the link failure algorithms are combined in the next section.
4.3 Combined PE and Link Failure
Here PE and Link failures are discussed separately for the sake of clarity.
4.3.1 PE Failure (in presence of faulty Links)
As explained earlier, a PE failure is handled in two stages.
Vertical Data Routing - A p Ei,J failure affects the stales of swit.clws s·;:J t I 
( i $ X $ m + 1 ). These switches can be in any of the sl.<ttes S'l;~·. S'(~ ' ' srr 
and srr depending 011 the or.currencc of earlier faults (they l'illlllot. IH~ irt sl.al.f'S 
STJ' and ST[' because these states can be reached only if then! is a failt!d /' 1~'1 • .) 
(0 $ x $ m- 1), in which case the algorithm fails now due t.o the JJoH-availahilit.y 
of a spare cell). 
When a $PE_{i,j}$ fails at $t_1$, it starts the re-routing, which is done in two stages.
The changes required for the intermediate stage and the final stage are listed below:
Intermediate stage- all swit.cbcs s;,J+l (i :::; X :::; TTL + I) chauge sl.itl.l! "' ' JH ! rtditt~ 
on their current state. Si~+l changes from 5'7'~ to STt or from S"J:Y l.o .',"/ ~:' . Jf 
it is in ST;'' it stays in S'Tj . Other switches s;:j+ I ( i < X ~ m + I) dlaltgl: fl'olll 
ST;{, sTr and STi to STi I from ST'j to S'lri and from .S''/~v l.o s'l:.Y, If s:,j-t I 
is in ST[, it stays in ST[. When a switch sg.i+1 (il > i) is iu stat.!! .cn:1v or irt 
ST;', the partial result of the P Eit-2,; docs not reach P Bi1 ,; hy ahove c:!tanges. 
So when a switch si\;.i+l is in state STt, the algorithm checks t.he swit.dti!S s.~ .r. 
(0 $X$ j) and finds a switch sg,jl (Q :5 j} $ j) (nearest to 5'1~,Jtd' whir.b is 1101. 
in S1i (when s.~.;+ I is in S1~v' t.hc algorithm checks si\;,j and if s,v;,J is in S7~\' ' the 
algorithm checks 8,~ .1 _ 1 • Again , if S.~.1 _ 1 is in srr, the algorithm checks S,\.1 _ 2 • 
In this way, the algorithm goes towards S,"';,O and finds sx.)l, which is not in 57~\'). 
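The leftward search described above can be sketched as a simple scan. Because the state names in this passage are hard to read in the source, the state being skipped is passed in as a parameter, `skip_state`, rather than fixed. The function name and the list-per-row layout are assumptions for illustration.

```python
from typing import List, Optional


def nearest_switch_not_in(row_states: List[int], j: int,
                          skip_state: int) -> Optional[int]:
    """Scan columns j, j-1, ..., 0 of one switch row.

    Returns the first column whose switch is not in skip_state,
    or None if every switch down to column 0 is in that state.
    """
    for col in range(j, -1, -1):
        if row_states[col] != skip_state:
            return col
    return None
```

The returned column plays the role of $j_1$ in the text; the value 7 used in any call is only a placeholder for the skipped state.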
L'.!mma 4.1 - The switch S,~,11 cannot be in S1i, STj, ST[ and S1~'; so it can 
he only in one of t.hc states STci, ST.V, sr.r and ST[. 
Proof- If a switch Si~.1• is in S1~v, the algorithm checks Si~.iz- 1 as specified in 
the algorithm and S',~;,Jr is IIOt defined as sg,jl' 
When a swit.ch s.~.1 .. is in STj or sr:, it means that Si~.iz+l is in ST." and 
tflcn S',~,j~+l Would be taken as Si~,JI and the algorithm Will not check Sj~;.iz' 
The switch. S,\;,11 cannot be in srr, because it is assumed that failures occur 
one at a time (and a switch can he in ST;' only during the intermediate stage of 
re-routing). 0 
Corollary 4.1.1 - The switch, S,t1•11 cannot be in states srr, srr and ST;'. 
Proof- When a switch, sJ;_ 1,1I is in STJ' or S7f, it means that Si~-l.iz+l is in 
STi and then Si~.1 .. +1 cannot be in ST{' (as will be shown in Lemma 4.2). So, here 
the algorithm will take Si~.iz+t as SX.it and it will not check Si~.iz' This proves 
that the switch, sg_ 1,11 cannot. be in STt or in STt, sg_t,jt cannot be in ST1 
hccause only one failure occurs at a time. 0 
Corollary 4.1.2 - When the switch, sK.jl is in STri, SX-t,jl must be either in 
S'J~' or in STi. 
Proof - If S,~_ 1 .; 1 is in ST;v, it would require Si~.it to be in STi, which is not 
possible. Similarly, Si~-I .il cannot be STj or ST[, because these conditions require 
Sil;Jt to be in S1'J. 
So, Si~-t,jt must be either in ST[ or in ST[. o 
Corollary 4.1.3 - When the switch, SK,jl is in srt, si~-l,jl can be only in ST[. 
Proof- When sg,jl is in sr.v, the switch, SK+l,jl would be in STi. So, Sft-t.jl 
cannot be in ST.V or S1;v because in a column only one switch can be in ST.V and 
ST[ (as will be pro\'ed in Lemma 4.2). The switch S,\_ 1.) 1 cannot. be in s·(~ · or 
sr:'' because it would require sl;,jl to be in sr~·. 
S S\l b . ''7•\' o, i+IJl must c m ::_. 0 . lJ 
Corollary 4.1.4 - When tht.• switch. s.\.;1 is in srr or ill S'/~~ · . ·"''.t-1 ,; 1 lllliSI lit' 
. S'f.',. m 3. [J 
If the switch Si\;.; 1 is in state S7~· or ST,'', the swit.cht.•s S,\,11 (j I ~ !I s j) 
change state from S'l~· and STi' to ST~· or from 87\v to ,'-i'/r ;uul swit.du·s S,'; _1, ,1 
( 'l < < ') h f c-7•V . I ..,,1,\' '-"J'V f "/'\' "/'\' \ J _ y _ J c angc state rom .::" 0 anu :::t 1 t.o ,} 3 or rom .':i .1 l.o ,"i ,; • 1 I. 
h \' t c same time P E,~,11 (j I < !J ~ j) start using the sccotul \WI.ical input port 1 1 • 1~ 1 • 
If th 't h l"\' • C"l'',. L''J'V ( • J co\ ' • • e•/•\' C SWI C .::'lit.;! IS Ill slate ,-, ,1 or.::" :; IIWC\1\IIIg!. lilt. ._,,l,;l-l IS Ill ,, 'l 
and the link, L~~::~~.)l is faulty), t.hc switches si\;,11 (j 1 < y $ j) ch;Lngt• st.al.e from 
STi to ST;' and SWitches S,~-l.l1 (j} < y ~ j) change lO 8'/~'. JINI! it. i\SSIIIIII!S 
I' T:' 
that though the link L,,~::~~.)l is faulty, t.he switches s.~-1.;1+1• ·"'.\.;1 H and liuk 
L~::~:~:+t may not be faulty. At t.hc same time, Ph:, 1,1J (jl < y 5 j) sl'lt~d tlwir 
second vertical input port. If after sclcctiug the second port., iLIIY / 1H dl't.c·rls an 
input error, a fatal failure ocr.urs. 
Final Stage - At the next clock edge, t + 1, the algorithm clocs t.lu~ followiu~: 
It changes s;:i+ 1 ( i + 2 $ X ~ m + 1 ) from S'l'i' t.o ST~ I r rolll ,"J"/~~, l.o ,t.,"/ :r or 
from STl to STl and in the case of a change from ST[ t.o S'l-;' of swit.ch s.~ .)t I' 
if il > i + 1, the SWitches Si\;, 11 and Si~tl,y (y WrlS defined earlier ill t.he int.c:rmediat.c! 
stage) are brought back to the states in which they were hcforc iul.«!rntedial.c! re-
routing. S]J+l does not change state during the final stage and si~I,J+I dmnges 
state to STi if it were in STi. 
Lemma 4.2 - In a column j, only one switch $S^V_{i,j}$ can be in state $ST^V_1$.
Proof - Switch $S^V_{i,j}$ goes to state $ST^V_1$ if and only if the vertical data routing path
requires bypassing of $PE_{i,j-1}$.
When column (j - 1) of the PEs does not have any faulty PE, the spare cell,
$PE_{m,j-1}$, is bypassed by bringing $S^V_{m,j}$ to $ST^V_1$ and $S^V_{m+1,j}$ to $ST^V_2$.
When column (j - 1) of the PEs has a faulty cell, $PE_{i,j-1}$, then this PE is
bypassed by bringing $S^V_{i,j}$ to $ST^V_1$ and $S^V_{i+1,j}$ to $ST^V_2$. In this case the spare cell,
$PE_{m,j-1}$, becomes an active cell and switches $S^V_{m,j}$ and $S^V_{m+1,j}$ go to $ST^V_0$. Now, if
another switch, $S^V_{i_1,j}$, is in $ST^V_1$, it means that $PE_{i_1,j-1}$ is faulty and it implies that
column (j - 1) of PEs has two faulty cells. Since the algorithm can tolerate only
one PE failure in a column, this condition leads to a fatal failure.
So, in a working array, only one switch in a column can be in $ST^V_1$. □
Corollary 4.2.1 - In a column j, only one switch can be in state $ST^V_2$. □
Corollary 4.2.2- lu a column j, only one switch can be in state STj. 
Proof- As shown in Lemma ~1.2, column (j + 1) can have only one switch, sti+t in 
state STt. Jn this case, 5,\:t.j+t would be in STi and these two switches provide 
the link t/?~•+J.t+l ( t' I d t tl b t PE d PE PE . 1 ~:,,_ 1 ,1 _ 1 ver .tea a a pa. 1 c ween i-t,j-t an i+l.i-1; !.i,j-l ts 
hypassrd). 
At this stage, when P Bi+J.i- 1 detects vertical data error (vertical input link 
failure}, thl.! algorithm hrings Si~ to STj (which will be discussed in the next 
subsection - link failure handling). 'When sri goes to S1'j, S4. 1J goes to ST.[ and 
these two provide an alternative data path between PEi-I,j-1 and PEi+t,j-t· Since 
in a column (j + 1 ), only one switch sri+ I can be in STt, column j can have only 
· h sv · -.Tv one swttc i,j m ~ 3 • 0 
The following three corollaries can be proved similarly. 
Corollary 4.2.3 - In a column j, only one switch can be in state ST[. 0 
Corollary 4.2.4 - In a column j, only one switch can be in state ST'[. 0 
Corollary 4.2.5 - In a column j, only one switch can be in state ST'[. 0 
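Lemma 4.2 and its corollaries amount to a per-column invariant that is easy to check in a simulation. The sketch below is an illustration only: states are plain integers, and the default set of "at most once per column" states is limited to 1, 2 and 3, the ones that can be read with reasonable confidence from Lemma 4.2 and Corollaries 4.2.1 and 4.2.2. The remaining corollaries can be covered by passing a larger set.

```python
from collections import Counter
from typing import Iterable, Sequence


def column_is_consistent(column_states: Sequence[int],
                         unique_states: Iterable[int] = (1, 2, 3)) -> bool:
    """True if each state in unique_states occurs at most once in the column."""
    counts = Counter(column_states)
    return all(counts[s] <= 1 for s in unique_states)
```

A column such as `[1, 0, 1, 2]` fails the check, which by the proof of Lemma 4.2 would correspond to two faulty PEs in one column, a fatal condition.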
Lemma 4.3 - In the event of an active $PE_{i,j}$ failure, switch $S^V_{i,j+1}$ cannot be in
state $ST^V_1$ before the reconfiguration starts.
Proof - Since before the reconfiguration starts, $PE_{i,j}$ is active and it is getting
data from $PE_{i-1,j}$ and providing the partial results to $PE_{i+1,j}$, the switch $S^V_{i,j+1}$
cannot be in state $ST^V_1$. □
The following corollary can be proved similarly.
Corollary 4.3.1 - In the case of an active $PE_{i,j}$ failure, switch $S^V_{i+1,j+1}$ cannot be
in $ST^V_2$ before the reconfiguration starts. □
Some representative cases of $PE_{i,j}$ failure handling are discussed next (see
Figure 4.6).
Case-A shows the array reconfiguration where column (j + 1) of switches
provides all data links required by the intermediate and final stages. Here, before
the failure, the switches $S^V_{i_x,j+1}$ ($i \le i_x < m$) are in $ST^V_0$, $S^V_{m,j+1}$ is in $ST^V_1$ and
$S^V_{m+1,j+1}$ is in $ST^V_2$.
When si~j+l is in stale STY (Case-B), then also all the paths a.re made il\'ililahh~ 
by changing si~J+I to S'/~~ and ot.her switches to STi for t.he intermediate Sl.4tge and 
then by changing s~I.J+I to S'l~' from S1~' and other s~J+I (i + ~ ~ r $., + I) 
from srr to S1'd for final stage. 
B h t coV ') • • "'J'\' I I / P/~·, 1 1 ut w en a switc 1, ,:~ii.J+• (il > z 1s sn stateS .1 , t U! pal. 1 , 1,1::,,·_1 •1 t:ca1111ol. 
be provided by the switches in the column j + I. So other switches in row i and 
i - l ae modified. 
In Case-C, P Bi,i fails and the intermediate stage requires a link hdwem1 JJ l~'is-'J ,1 
and PEi1,j, wbkh cannot be provided by the switches in the (j +I )lh c:olunm. So 
the algorithm finds sr.,Js in st.atc STci. Due to an earlier PE.s-l,i-1 failure~, s·,~-t ,J 
is in ST.V and Si~.i is in STl. For intermediate stage routing, 8,~_ 1 ,1 _ 1 and S,~- 1 ,1 
are brought to STi and sK.il and si~,j arc brought to S1~v. 
Case-D is similar to Case-C, but here 8{;,11 is in ST.V, So, Si~,; 1 is chang!!d t.(J 
ST[. The other changes are the same as written for Ca.c;c-A. 
Case-E is another variation of Case-C, here sr. -1 ,jl is in STl. sr. -l,j I is changed 
Figure 4.6: Vertical Switch State Changes (Combined PE and Link failure). Panels: Case-A to Case-F. Legend: normal route; intermediate stage rerouting.
to ST'[ and other changes are same as written in Case-A. 
In Case-F, 8;~;_ 1 ,11 is in S1~' and S;';,J, is in srr. Here S'at-t" is rhanw·tl tu 
C'T'1 d .... v . h d C'1''' 'l'h . I "'" J ,,. . .. ,., . 
..:- . 3 an ,:~it,j ts c angc to.:~ 4 • c swttc tes ._ 11 _ 1•11 an ~.t.;t arc 111 slal.t•s ."1 :1 
and ST,Y because P E~;_11 detected a link failure earlier (and P Ett-t.;H· t was faulty 
that time). The latest rcconliguration assunK>s that though tlw link L1,:r.,:,., .1 _ 1 is 
···-~ -J-1 
faulty, link L~;:~I.J is not faulty. In the case of faulty /.~::~ 1 •1 it fatal failure ucrms . 
If the links which are newly generated by using the switches of columns $j_x$
($j_x < j$) are required by the final stage (when $i_1 = i + 1$), the algorithm does not
change the states of the switches generating these links. Otherwise, at $t + 1$, these
switches go back to the states they had before $t_1$.
Horizontal Data Routing - This is exactly similar to the horizontal data routing
explained in the previous algorithm (the PE-failure-only algorithm). The reconfiguration,
invoked due to $PE_{i,j}$ failure, changes the states of the switches $S^H_{i_x,j}$ ($i < i_x \le m + 1$),
either from $ST^H_0$ to $ST^H_1$ or from $ST^H_1$ to $ST^H_0$, during the intermediate stage of
rerouting. During the final stage, the states of the switches $S^H_{i_x,j+1}$ ($i < i_x \le m + 1$)
are changed, either from $ST^H_0$ to $ST^H_1$ or from $ST^H_1$ to $ST^H_0$.
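The two-stage horizontal rerouting can be illustrated with a toy model in which each horizontal switch holds state 0 or 1 and a reconfiguration simply toggles it. The function name and the nested-list layout (`h_states[row][column]`, rows 0 to m+1) are assumptions for the example; the row and column ranges follow the paragraph above.

```python
from typing import List, Tuple

Grid = List[List[int]]


def horizontal_reroute(h_states: Grid, i: int, j: int, m: int) -> Tuple[Grid, Grid]:
    """Return the switch states after the intermediate and the final stage.

    Intermediate stage: toggle column j for rows i+1 .. m+1.
    Final stage:        toggle column j+1 for rows i+1 .. m+1.
    The input grid is left unchanged.
    """
    stage1 = [row[:] for row in h_states]
    for x in range(i + 1, m + 2):
        stage1[x][j] = 1 - stage1[x][j]
    stage2 = [row[:] for row in stage1]
    for x in range(i + 1, m + 2):
        stage2[x][j + 1] = 1 - stage2[x][j + 1]
    return stage1, stage2
```

Keeping the two stages as separate snapshots mirrors the two clock edges ($t_1$ and $t + 1$) at which the switches change state.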
4.3.2 Link Failure (in presence of faulty PEs) 
As explained earlier, each link is duplicated here. When a PE detects an error in
the data available at its first input port, it invokes a reconfiguration and selects
the second input port. If a PE is already using the second input port, an input data
error leads to a fatal failure. The vertical and horizontal data paths are discussed separately.
Vertical Path - When a P Ei.i detects a fault in its input data (at port IY,1~0 ) , it. 
does the following: 
• if SlJ+I is in STo, STJ or in 81~', then the PE;,j simply selects t.hc other 
input port (/~Ed, 
• dse if St1+1 is in ST{' , ST[ or in ST.;', then the algorithm checks s .. ~~ and 
if it is in :n;v, the algorithm checks S;j_1 and so on, until it finds a switch 
s.~z: (0 $X < j ), which is not in state STi (here, srr would be in one of the 
slates S7'J', S'ft, 5'1~v or STt, lemma 4.1 shows this) and then: 
- if S',~r is in 87~', it changes all S;'-: 1,11 (x $ y $ j) from STri to ST[, 
from ST.;' to S7t or from ST[ to ST[ and changes all sr11 (x :S y :S j) 
to STt from S'i'J' and ST{' or to STt from ST.;' and select input port 
IY,EI for all P Ei,y (x :5 y :5 j), 
I 'f sv . . "1'v . h sv ST.v sv S1'v d h 
- c se 1 i,r 1s 1r1 ..J 1 , Jl c angcs i-l,r to 3 , i,:r to 5 an c anges 
all S.."- 1,11 (x < y $ j) to S7'[ from STt, to ST[ from ST!( and al1 s .. ~11 
(x < y :S j) to STt from ST[, to STt from ST.;' and select input port 
IY,E1 for P J;;i,11 ( x $ y $ j), 
- else if Si~r is iu state STt or in ST[, it changes S.."- 1,11 (x < y $ j) to 
ST[ from S"Jt, to S1'J' from STi and all S[11 (x < y $ j) to ST.;' from 
ST{', S1~' from S''/~' and select input port I~Et Cor PEi,11 (x < y $ j). 
Some link failure reconfigurations are shown in Figure 4.7. Only the paths
which are modified are shown.
In Case-A, link failure is detected by P EiJ, but since the switch S/j+L is in sr:, 
no rcconfiguration is done, P Ei,j simply selects the other input port. Similarly, in 
Casc-B, Si~+t is in s1:r, ~o no re-routing is done and the second input port is 
selected. 
In Case-C, link failure is detected by P Ei,i and the switch SW+I is in ST[ due 
to the earlier failure of P Ei-i,i· Now the algorithm finds Sw_1 in STJ' and modifies 
st .. 1.i-u S[1.i to STJ' and Sw_1, Si'J to ST[. 
Case-D is similar to Casc-C, but here S}'_1,j-t is in ST[, so it is brought to 
('tr.~' 
" 6 • 
Figure 4.7: Link failure Reconfigurations. Panels: A to F. Legend: initial path; changed path.
In Casc-E, PE,,1 detects the link failure and stj+I is in S1~v, so the algorithm 
finds .'ii~-t in ST.V and changes it to STt. The states of s,v_t,)-t' SY.. 1,J arc changed 
to STj and st, is dtangc!d to S1~v. 
In Ca.o;e-F, S,J+J is in ST'[ and SG-l is found in ST.,.V. Here. SiJ is changed to 
S'Jt and s,v_ 1,J is changc!d to STJ, It is assumed that though the link L~f:::~:-t 
is faulty, the link, L~:~'·1 is not faulty. If this link is also (aulty, P Ei,j would again 
detect vertical input. error and it. would cause a fatal failure. 
Horizontal Path - When a $PE_{i,j}$ detects a fault in its horizontal input data,
it performs the following operations:
• if it is using $I^H_{PE0}$, it selects $I^H_{PE1}$,
• else if it is using $I^H_{PE2}$, it selects $I^H_{PE3}$,
• otherwise the algorithm fails and a fatal failure occurs.
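The three rules above map directly onto a small selection function. In this sketch the four horizontal input ports are represented by the integers 0 to 3, and the exception class `FatalFailure` is a hypothetical stand-in for the fatal failure signal.

```python
class FatalFailure(Exception):
    """Raised when no redundant horizontal input port is left."""


def next_horizontal_port(current_port: int) -> int:
    """Port to select after a horizontal input error on current_port."""
    if current_port == 0:
        return 1          # first pair: fall back from port 0 to port 1
    if current_port == 2:
        return 3          # second pair: fall back from port 2 to port 3
    # Ports 1 and 3 are already the backup of their pair.
    raise FatalFailure(f"input error on backup port {current_port}")
```

A PE that has already fallen back to port 1 or port 3 has no further spare link, so a second horizontal error on the same pair ends in a fatal failure.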
Theorem 4.1 - The cases shown in Figure 4.6 represent all the possible combinations
of vertical switch states in the case of a $PE_{i,j}$ failure.
Proof - The reconfiguration in the case of a $PE_{i,j}$ failure reroutes the vertical data
by changing the states of the switches in column (j + 1). The states of the switches
in column (j + 1) depend on earlier PE and link failures in column (j + 1) of PEs.
There are only four combinations of failures in column (j + 1) of PEs and these
are listed below.
1. Column (j + 1) of PEs has no faulty PE and no faulty link,
2. column (j + 1) of PEs has a faulty PE,
3. column (j + 1) of PEs has only link failures and
4. column (j + 1) of PEs has both PE and link failures.
The effect of each of these failures on the reconfiguration (which is invoked due to
$PE_{i,j}$ failure) is discussed separately.
Column j + 1 has no faulty PE and no faulty link - here, column (j + 1)
of switches would be in its initially set state and this corresponds to Case-A of
Figure 4.6.
Column j + 1 has only one faulty PE - here also, column (j + 1) of switches
would be in its initially set state and this corresponds to Case-A of Figure 4.6.
Column j + 1 has only link failures - here the output links may or may not
be faulty. When the output link is not faulty, column (j + 1) of switches would not
be disturbed by these link failures and this is covered in Case-A of Figure 4.6.
When the output. link is faulty, s~,j+l would he in S1'j and s~;+l.) H woultl 
be in ST4v. In this case, t.hc link l.P1JC (the link thnt. carrif!s the vertical result. Cm - l,J 
to the central processor from P Bm-l.j) is not provided hy the swit.ehcs in wlunm 
(j + 1 ). Here, all switches. s~:+t.j~ (0 ~ ix ~ j + 1) woulcl he in S"J::' (as c~xplailu!d 
in the subsection 4.2). This corresponds to Case-F of Figure .J.O. Jl,!re j I = j, so 
no switches would change state. The only difrcrcncc here is that. for Case- F, it. was 
assumed that though the link, L~:~~::~: '11 is faulty, link 1-~::~: .11 is uot. fmllt.y, hut. 
now this assumption is not. required because S~+I,J did not reach STY dnf: to t.hf! 
link failure detected by column j of switche:-:. 
Column j + 1 has both faulty PE and faulty links - in this case, the failures
affect column (j + 1) of switches only if $PE_{i_2,j+1}$ fails and a vertical input error is
detected by $PE_{i_2+1,j+1}$. It brings $S^V_{i_2,j+1}$ to $ST^V_3$ and $S^V_{i_2+1,j+1}$ to $ST^V_4$.
When ($i_2 < i - 1$), the switches $S^V_{i_x,j+1}$ ($i \le i_x \le m + 1$) are not affected by the
above mentioned failures. This condition corresponds to Case-A of Figure 4.6.
When ($i_2 = i - 1$), the switch $S^V_{i-1,j+1}$ would be in $ST^V_3$ and $S^V_{i,j+1}$ would be in
$ST^V_4$. This condition corresponds to Case-B of Figure 4.6.
When (i2 > i- I), cmc of l}w links required by the intermediate stage (namely 
link L~~:~~::~) would not be provided by the column (j + 1) of switches. Here, 
switch S,~+IJ+I is l'cnarncd as S'i~,j+1 for the sake of clarity and it can be either in 
S'lt or in s1:r. Once the algorithm finds Si~.i+l in ST[ or in STt, it checks S;~.i 
and if it is in ST{', the algorithm checks 5',~,1 _ 1 and sn .,n, until it finds a switch. 
Si~.1 ., which is not i11 S1~v. Clearly all switches, Si~,1, (j 1 < ir ~ j) would be in 
S'f'[ and S,~-t.i6 would he in STt. The switch, S~.i 1 can be in ttny of the states 
STri, ST.V, S7'[ and S7~Y (as proved in Lemma 4.1 ). 
When S,~.Jt is in 87~, Si~-1.11 can only be either in STri or in ST{ (as proved 
in Corollary 4.1.2) a.nd these two conditions correspond to Case-C and Case-D of 
Figure 4.6 respectively. 
When 8i~.} 1 is in STt, Sil;_~,11 would be in STJ' .(as proved in Corollary 4.1.3) 
and it corresponds to Case- E of Figure 4 .6. 
When S;~.it is in ST.Y or in S7'[, S;~-t,jt would be in STj (as proved in Corol-
lary 4.1 A) and it corresponds to Casc-F of Figure 4.6. 0 
Theorem 4.2 - The cases shown in Figure 4.7 represent all the possible combinations
of vertical switch states in the case of a vertical link failure detected by $PE_{i,j}$.
Proof - When a vertical link failure is detected by $PE_{i,j}$, the reconfiguration
depends on the state of switch $S^V_{i,j+1}$. It can be in any state depending on the
occurrence of earlier faults.
When sri+t is in STJ', STJ or in SJ:Y, it means that P Ei,j is receiving the 
input from PEi-lJ and it corresponds to Case-A of Figure 4.7. 
When S~+t is either in STt or in ST[, P E,,j is faulty and no reconfiguration 
is invoked. 
When St'J+l is in S1;v I STJ' or STi (here it can be in srr because the inter-
mediate stage of rerouting always instructs the PEs to use the switches of column 
(j + 1) and it may have faulty links before the failure of P Ei,j ), P Ei.j receives 
vertical input from P Ei-2.j and in this case an alternative vertical data path is 
required. For this, the algorithm checks the switch. S~.~ and if it is in ST]'. t.lw 
algorithm checks Si~J-t· If S/.}_ 1 also is in S1 ;', the algorithm dterks 8~.~- 1 and 
so on, until it find a switch, S~;j,, which is not in 5'11' (Si.;t can h<' in '"'Y of t.he 
states ST;', srr, sr.r or S'l~·, as proved in Lemma 4.1 ). 
Wh Sv · · ST.v .:..·v 1 . h . S'l'\' ' co'l'\' (C II ) en iJl ts m 0 , ,:~i-t,;l can >c etL cr m . 0 or 1t1 '' 2 .oro ary ·1.1.'2. 
and these conditions correspond to Case-C and Case-D of Figure ·1. i I'I'SJH'd.iwly 
(switch, Si~+ 1 is shown in ST{'; when it is in S1'[ or STf', the chauges would ht• 
t.he same). 
When Si~t is in STt, st_t.11 would be in S'l;r (Corollary ·1.1.:1) ancl it. c:orre· 
sponds to Case-E of Figure 'l.i. 
When SJ'J1 is in STY or in ST.i', sr_,,i1 would be in S7j (Corollary ·1.1.'1) and 
it corresponds to Case- F of Figure 4. 7. 0 
In the next section, a scheme for implementing this algorithm is proposed.
4.4 Implementation 
A scheme is proposed here to implement the above algorithm for combined PE
and link failure handling. The proposed scheme uses an external clock ($CLK_S$) for
the switch state changes in the case of PE failures. Reconfiguration in the case
of a link failure does not need any external clock, but when there is a PE failure
at $t_1$ (see Figure 4.8), two clock pulses are provided to the switches and the next
clock edge $t + 1$ is delayed. At $t_3$, the first clock edge is applied to the switches to
complete the intermediate stage of reconfiguration. The on-time of $t_3$ depends on
the time taken by the PEs to check their inputs. If the input checking time is $t_c$, then
the on-time of $t_3$ is $t_{on} = t_c + \delta t$, where $\delta t$ depends on the RR propagation time and
the switch settling time. This is done to ensure the proper routing of vertical data in
the case of a vertical link failure detection between the intermediate and final stages of
Figure 4.8: Various Clock Signals. Traces: PE clock (normal); PE clock (PE failure at $t_1$); switch clock (due to PE failure); PE clock (link failure at $t_1$).
PE failure reconfiguration (due to an earlier link failure). At $t + 1$, the second clock
edge arrives at the switches and completes the final stage of the reconfiguration.
The next pulse $t + 2$ to the PEs is also delayed to accommodate the switch settling
time.
In the case of a link failure at $t_1$, the next clock edge, $t + 1$, to the PEs is
delayed and no separate clock is given to the switches. Various changes required
in the network, processing element and switch module are given in the following
subsections.
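The on-time of the first switch clock pulse is described above as the input checking time plus a margin that depends on RR propagation and switch settling. A small numerical sketch follows. The text only says the margin "depends on" those two times, so taking it as their sum is an assumption, as are the function name and the use of integer nanoseconds.

```python
def switch_clock_on_time(t_check_ns: int, rr_propagation_ns: int,
                         switch_settling_ns: int) -> int:
    """On-time of the first switch clock pulse, in nanoseconds.

    Assumes the margin delta_t is the sum of the RR propagation time
    and the switch settling time (an illustrative choice).
    """
    delta_t = rr_propagation_ns + switch_settling_ns
    return t_check_ns + delta_t
```

With a 20 ns input check, 5 ns RR propagation and 3 ns settling time, the pulse would be held for 28 ns under this assumption.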
4.4.1 Network 
The network is made capable of: 
• informing the central processor of the occurrence of P E and link failure; 
• informing the central processor of the occurrence of fatal failure; 
• invoking the reconfiguration;
• providing the clock pulses to the switches and
• initiating the switches.
The central processor provides horizontal and vertical data inputs, PE-clock
($CLK_{PE}$) and switch-clock ($CLK_S$) to the array and it receives reconfiguration
requests and output from the array. It provides a signal called SPE (Spare PE) to
the PEs of the bottom-most row. This signal brings the switches $S^V_{m,j}$ and $S^V_{m+1,j}$
($0 \le j \le n$) to states $ST^V_1$ and $ST^V_2$ initially.
The network is shown in Figure 4.9.
4.4.2 Processing Element 
Various control and data lines for a processing element are shown in Figure 4.10.
Each PE gets four horizontal inputs ($I^H_{PE0}$, $I^H_{PE1}$, $I^H_{PE2}$ and $I^H_{PE3}$) and two vertical
inputs ($I^V_{PE0}$ and $I^V_{PE1}$). Similarly, each PE has four horizontal output ports
($O^H_{PE0}$, $O^H_{PE1}$, $O^H_{PE2}$ and $O^H_{PE3}$) carrying the same data and two vertical output ports
($O^V_{PE0}$ and $O^V_{PE1}$) carrying the same information.
Each $PE_{i,j}$ gets two control signals, $RR^{PE_{i,j}}_{PE_{i-1,j}}$ and $SVI^{PE_{i,j}}_{S_{i,j+1}}$. $RR^{PE_{i,j}}_{PE_{i-1,j}}$ is the
reconfiguration request from $PE_{i-1,j}$ and $SVI^{PE_{i,j}}_{S_{i,j+1}}$ is the command for selecting
the proper vertical input port. There is no such SVI control input for horizontal
input selection because the horizontal input port selection is done by the PE itself.
$PE_{i,j}$ issues various control signals (reconfiguration requests) to other switches
and PEs. It generates $LF^{S_{i,j+1}}_{PE_{i,j}}$ (link failure for vertical data) in the case of a
detected vertical input data error and it is sent to $S_{i,j+1}$. Another link failure signal is
sent to the central processor in the case of an input error (vertical or horizontal).
The central processor delays the next clock edge, $t + 1$, to the PEs after receiving
this signal. This delay time depends on the time required for RR propagation
and switch settling time. If $PE_{i,j}$ detects a self fault, it sends $RR^{S_{i,j+1}}_{PE_{i,j}}$ to $S_{i,j+1}$,
$RR^{S_{i+1,j+1}}_{PE_{i,j}}$ to $S_{i+1,j+1}$, $RR^{S_{i+1,j}}_{PE_{i,j}}$ to $S_{i+1,j}$ and $RR^{PE_{i+1,j}}_{PE_{i,j}}$ to $PE_{i+1,j}$.
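The fan-out of reconfiguration requests on a self-detected fault can be listed programmatically. This sketch only enumerates the four destinations named in the text. The tuple encoding ("S" for a switch, "PE" for a processing element) and the function name are illustrative assumptions.

```python
from typing import List, Tuple

Target = Tuple[str, int, int]   # (kind, row, column)


def self_fault_requests(i: int, j: int) -> List[Target]:
    """Destinations of the RR signals issued by PE(i, j) after a self fault."""
    return [
        ("S", i, j + 1),        # switch beside the failed PE
        ("S", i + 1, j + 1),    # switch one row below, next column
        ("S", i + 1, j),        # switch one row below, same column
        ("PE", i + 1, j),       # next PE down the column, towards the spare
    ]
```

The last entry is the request that travels on down the column until it reaches the spare in row m.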
Figure 4.9: Network for Combined PE and Link Failure. Legend: horizontal path; vertical path; control line; RR to the central processor.
Figure 4.10: Processing Element Lines for Combined PE and Link Failure
The block diagram of $PE_{i,j}$ is given in Figure 4.11. The $I^H_{PE0}$ and $I^H_{PE1}$ inputs to $PE_{i,j}$
come from $S_{i+1,j}$ and $I^H_{PE2}$ and $I^H_{PE3}$ come from $S_{i,j}$ (see Figure 4.9). Depending
on the earlier reconfiguration, one of the pairs ($I^H_{PE0}$, $I^H_{PE1}$ or $I^H_{PE2}$, $I^H_{PE3}$) is
selected by using MUX-C. Initially, $PE_{i,j}$ receives horizontal data from $S^H_{i+1,j}$
using the $I^H_{PE0}$ input port. When it detects a horizontal input error, its $E_{IH}$ (error in
horizontal input) line becomes high and $PE_{i,j}$ selects $I^H_{PE1}$ in place of $I^H_{PE0}$. Now,
if $PE_{i,j}$ receives $RR^{PE_{i,j}}_{PE_{i-1,j}}$, the $H_S$ line is reset and the select line of MUX-C is made
high, which selects port $I^H_{PE2}$ for horizontal input.
Once either $I^H_{PE1}$ or $I^H_{PE3}$ is selected, $E_{IH}$ becomes 0 because the new data
are correct (it is assumed that only one failure can occur at a time), but $H_S$ remains
high. Next, if the same PE detects another horizontal input error, it again
makes $E_{IH}$ high and the RR generating circuit (shown in Figure 4.12) generates a
fatal failure signal.
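The horizontal port selection described above can be sketched as a small state machine. This is only an illustrative model, not the thesis circuit: the class name, the attribute `pair_select` (standing in for the MUX-C select line) and the method names are hypothetical, and the port numbering 0 to 3 is assumed to correspond to $I^H_{PE0}$ to $I^H_{PE3}$.

```python
class HorizontalInputSelector:
    """Illustrative model of the horizontal input port selection of one PE.

    Assumptions (not from the thesis): port k stands for I^H_PEk,
    `pair_select` models the MUX-C select line and `h_s` models H_S.
    """

    def __init__(self):
        self.pair_select = 0   # 0: pair (PE0, PE1), 1: pair (PE2, PE3)
        self.h_s = 0           # 0: first port of the pair, 1: second port
        self.fatal = False

    @property
    def port(self):
        # Selected port index: 0..3
        return 2 * self.pair_select + self.h_s

    def on_horizontal_error(self):
        # First error within a pair: switch to the second port of the pair.
        # A further error while H_S is already high is a fatal failure.
        if self.h_s:
            self.fatal = True
        else:
            self.h_s = 1

    def on_rr_from_upper_pe(self):
        # A reconfiguration request from the PE above resets H_S and
        # selects the other pair, i.e. port I^H_PE2.
        self.h_s = 0
        self.pair_select = 1


sel = HorizontalInputSelector()
sel.on_horizontal_error()      # PE0 -> PE1
sel.on_rr_from_upper_pe()      # PE1 -> PE2
print(sel.port, sel.fatal)
```

The model keeps the two select bits separate because the text treats the pair choice and the port choice within a pair as two different lines.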
Figure 4.11: Block Diagram of the Processing Element (combined PE and link
failure algorithm)
$I^V_{PE0}$ and $I^V_{PE1}$ come to MUX-D and if a vertical input error is detected, $E_{IV}$
becomes high and it is passed on to $S_{i,j+1}$ as $LF^{S_{i,j+1}}_{PE_{i,j}}$, which changes the switch
states, if required. As explained previously, a PE failure may also require other
PEs to select their other vertical input port. This is done by the switch $S_{i,j+1}$.
$S_{i,j+1}$ generates $SVI^{PE_{i,j}}_{S_{i,j+1}}$, which is used by the PE to select the proper vertical
input port (when SVI is low, $I^V_{PE0}$ is selected and a high SVI selects $I^V_{PE1}$).
If a vertical input error is detected while SVI is high, fatal failure occurs.
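As a minimal sketch of the rule above, the following helper models how one vertical input error is handled as a function of the current SVI level. The function name and the tuple it returns are hypothetical; in the thesis the decision is split between the PE (which detects the error) and the switch $S_{i,j+1}$ (which drives SVI).

```python
def handle_vertical_error(svi):
    """Return (selected_port, new_svi, fatal) after one vertical input error.

    Hypothetical helper: port 0 stands for I^V_PE0 and port 1 for I^V_PE1.
    """
    if svi:
        # The second port is already in use, so no spare path is left.
        return 1, 1, True
    # First error: the switch raises SVI and the PE moves to the second port.
    return 1, 1, False


print(handle_vertical_error(0))
print(handle_vertical_error(1))
```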
When the testing block of a PE detects a fault in the PE's logic circuit, it generates
an error signal, $E_{LOGIC}$ (this signal remains valid until the array is taken off-line),
which is used to generate the RRs.
After loading the coefficients, the central processor sends a signal, SPE (spare
PE), to the PEs of the bottom-most row (the spare cells), which is latched as $SPE_L$.
SPE makes the RR lines from the spare cells to their switches high for some time, so
that $CLK_S$ brings $S^V_{m,j}$ and $S^V_{m+1,j}$ ($0 \le j < n$) to $ST^V_1$ and $ST^V_2$ respectively.
$SPE_L$ is used to ensure that no RRs are generated by a spare cell when it detects a
self-fault or input error.
The RR generating circuit is shown in Figure 4.12. Various inputs and outputs
are shown in the block diagram and the timing diagram of the output control
signals is shown in Figure 4.13. Figure 4.13.a shows the RRs generated by $PE_{i,j}$
when it fails at time $t_1$ and Figure 4.13.b shows the RRs generated by $PE_{i,j}$ when
it receives $RR^{PE_{i,j}}_{PE_{i-1,j}}$ at $t_1$ due to the failure of $PE_{i1,j}$ ($i1 < i$). When a faulty $PE_{i,j}$
receives $RR^{PE_{i,j}}_{PE_{i-1,j}}$, it generates a fatal failure signal.
Figure 4.13.c shows $CLK_{PE}$, $LF^{CP}_{PE_{i,j}}$ and $LF^{S_{i,j+1}}_{PE_{i,j}}$ in the case of link failures.
At $t_1$, a horizontal data error is detected and the central processor is informed but
no information is sent to $S_{i,j+1}$. At $t_2$, a vertical data error is detected and both
the central processor and $S_{i,j+1}$ are informed about this failure. All RRs are reset
at time t + 1.
Figure 4.12: Schematic of the RR generating Circuit (combined PE and link failure
algorithm)
Figure 4.13: Timing Diagram of the RR generating Circuit (A: $PE_{i,j}$ fails;
B: $PE_{i1,j}$ fails, $i1 < i$; C: link failures at $t_1$ and $t_2$)
4.4.3 Switch Module 
As explained earlier, each switch module, $S_{i,j}$, is a pair of switches, $S^H_{i,j}$ and $S^V_{i,j}$,
which are used to route the horizontal and vertical data respectively. Various data
and control lines for a switch module $S_{i,j}$ are shown in Figure 4.14. It gets $LF^{S_{i,j}}_{PE_{i,j-1}}$
and $RR^{S_{i,j}}_{PE_{i,j-1}}$ from $PE_{i,j-1}$, $RR^{S_{i,j}}_{PE_{i-1,j-1}}$ from $PE_{i-1,j-1}$, $RR^{S_{i,j}}_{PE_{i-1,j}}$ from $PE_{i-1,j}$,
$RR^{S_{i,j}}_{S_{i+1,j}}$ from $S_{i+1,j}$ and $RR^{S_{i,j}}_{S_{i,j+1}}$ from $S_{i,j+1}$ as control inputs and based on these
data, it changes the switch state and generates $RR^{S_{i-1,j}}_{S_{i,j}}$, $RR^{S_{i,j-1}}_{S_{i,j}}$ and $SVI^{PE_{i,j-1}}_{S_{i,j}}$.
If $PE_{i,j}$ fails at $t_1$ (see Figure 4.14), the central processor provides two clock
pulses to the switches by using a global switch clock line at $t_s$ and t + 1. At $t_s$,
the intermediate stage of the rerouting is completed and at t + 1 the final stage is
completed.
The delay, $t_s - t_1$, depends on the switch settling time and the number of columns
in the array, because the RRs go from $S_{i1,j+1}$ to $S_{i1,0}$ ($i1 > i$), if the switch $S^V_{i1,j+1}$
is in a state that generates them (as explained in the algorithm). Similarly, $(t + 1) - t_s$
depends on the PE processing time and the switch settling time. The next clock edge to
the PEs, t + 2, is also delayed to allow for the switch settling time during the final
stage of reconfiguration. After t + 2, the clock resumes its normal speed.
If a link fails at $t_2$, the next clock edge, t + 1, to the PEs is delayed by a pre-specified
time to provide sufficient time for RR propagation and switch settling and after
t + 1, the clock resumes its normal speed.
Vertical and horizontal switches are discussed separately.
Figure 4.14: Data and Control Lines for a Switch (combined PE and link failure
algorithm). Panels: Controls and Data Lines for a Switch; Switch Clock due to PE
and Link Failure.
Vertical switch - It has two sub-circuits: the control circuit and the switching
circuit. The control circuit changes the states of the switches and generates various
RRs, while the switching circuit provides the proper input-output connections
based on the state-data made available by the control circuit. Various switch state
changes are described in the algorithm and to achieve the proper changes, the block
diagram shown in Figure 4.15 is proposed for the control circuit. For the sake of
clarity, various signals are renamed as shown below:
$A_I = RR^{S_{i,j}}_{S_{i,j+1}}$,    $A_O = RR^{S_{i,j-1}}_{S_{i,j}}$,
$B_I = RR^{S_{i,j}}_{S_{i+1,j}}$,    $B_O = RR^{S_{i-1,j}}_{S_{i,j}}$,
$C = RR^{S_{i,j}}_{PE_{i-1,j-1}}$,   $D = RR^{S_{i,j}}_{PE_{i,j-1}}$,
$E = LF^{S_{i,j}}_{PE_{i,j-1}}$,     $X = SVI^{PE_{i,j-1}}_{S_{i,j}}$.
Once a $PE_{i,j}$ fails at $t_1$, various RRs are generated. If a link fails, E arrives
and it is latched as $E_L$, which generates an $A_O$ signal if $S^V_{i,j}$ is in $ST^V_2$ or in $ST^V_7$.
$E_L$ is reset at the next falling edge of the switch clock ($CLK_S$) for the switch $S_{i,j}$,
if $S^V_{i,j}$ is in the intermediate stage of rerouting due to a $PE_{i-1,j-1}$ failure. Otherwise
it gets reset at t + 1, when E goes low.
$A_I$ appears at $S_{i,j}$ if $S_{i,j+1}$ needs the state change of $S_{i,j}$. In the block diagram,
$s_2 s_1 s_0$ give the present state of the vertical switch.
In the case of a $PE_{i,j}$ failure, the changes required by the reconfiguration algorithm
depend on the index of the switch. These changes are listed in Table 4.1.
X represents that the switch cannot be in this state. No switch $S^V_{i1,j+1}$ can be in
states $ST^V_6$ or $ST^V_7$ without an earlier PE failure in column j, which causes fatal
failure now. Similarly, $S^V_{i,j+1}$ cannot be in $ST^V_1$ or $ST^V_2$ (Lemma 4.3), because if
there is no previously failed PE in column j, only $S^V_{m,j+1}$ would be in $ST^V_1$ and
only $S^V_{m+1,j+1}$ would be in $ST^V_2$. In this case, only the failure of $PE_{m,j}$ would find
$S_{m,j+1}$ in state $ST^V_1$ and since $PE_{m,j}$ is the spare cell, no reconfiguration is invoked.
Similarly, $S_{i+1,j+1}$ cannot be in $ST^V_2$ (Corollary 4.3.1).
When $S^V_{i1,j+1}$ ($i1 > i$) (which is either in $ST^V_4$ or in $ST^V_5$) receives D, it generates
$A_O$. After receiving $A_I$, $S^V_{i1,j}$ changes state either to $ST^V_4$ from $ST^V_0$ & $ST^V_2$ or to
$ST^V_5$ from $ST^V_1$ and it issues $B_O$ to $S^V_{i1-1,j}$, which causes $S^V_{i1-1,j}$ to change state
either to $ST^V_3$ from $ST^V_0$ & $ST^V_1$ or to $ST^V_6$ from $ST^V_2$. If $S^V_{i1,j}$ is in $ST^V_2$, it again
generates $A_O$ and the RR propagates towards $S_{i1,0}$ in this manner until a switch, $S^V_{i1,j1}$,
is found, which is not in $ST^V_2$. $S_{i1,j1}$ does not generate $A_O$ and at the same time
$S_{i1,y}$ ($j1 < y \le j + 1$) instruct $PE_{i1,y-1}$ to use the second vertical input port by
setting the SVI signals. $A_O$ and $B_O$ change the state of the switches without using
the external clock. This is the reason behind delaying $t_s$ after $t_1$, so that all the
switches which need to generate $A_O$ and $B_O$ can generate these signals and $A_I$ and
$B_I$ get sufficient time to change the state of the switches (because at $t_s$ the switch
$S_{i1,j+1}$ changes state and the $A_O$ generated by it may not stay after that).
Figure 4.15: Block Diagram of the Vertical Switch (combined PE and link failure
algorithm)
$A_I$ and $B_I$ are latched as $F_L$ at the falling edge of $CLK_S$ and $F_L$ is used to
bring the switch to its prior-to-$t_1$ state, if the final stage requires so. If these
newly generated paths (generated by $A_I$ and $B_I$) are required by the final stage of
reconfiguration (when $i1 = i + 1$), $S_{i1,j+1}$ resets $A_O$ at $t_s$ and $F_L$ remains low.
$A_I$ and $B_I$ change the states of the switches according to Table 4.2. The next
clock edge, t + 1, to the switches brings the changes listed in Table 4.3 (the states
of the switches before t + 1 are taken from Table 4.1).
At t + 1, switches having $F_L = 1$ go back to their prior-to-$t_1$ state. After t + 1,
the reconfiguration is complete and all the RRs are reset.
In the case of a link failure communicated by $PE_{i,j}$ to $S_{i,j+1}$, $A_O$ is generated by
$S^V_{i,j+1}$ if it is in $ST^V_2$ or $ST^V_7$ (the states with a 1 in the $E_L$ column of Table 4.6).
This $A_O$ propagates towards $S_{i,0}$ and generates $B_O$ as explained earlier, until it finds a
switch $S^V_{i,j1}$ which is not in $ST^V_2$. The changes caused by $A_I$ and $B_I$ are listed in
Table 4.2.
Since the changes depend on the physical index of the switch in the case of a
$PE_{i,j}$ failure, $B_I$, C and D are used to decode the position of the switches. For
$S_{i,j+1}$, $B_I$ and C are low and D is high (i.e. $\overline{B_I}\,\overline{C}\,D$ is high). For $S_{i+1,j+1}$,
$B_I\,\overline{C}\,D$ is high and for $S_{i1,j+1}$ ($i1 > i + 1$), $B_I\,C\,D$ is high.
Three J-K flip flops ($s_2$, $s_1$ and $s_0$) are used to store the current state of the
switch and three D-latches ($s_{2L}$, $s_{1L}$ and $s_{0L}$) are used to store the previous state of
the switches.
Table 4.1: State changes for Intermediate Stage
State          State after $t_s$
before $t_1$   $S_{i,j+1}$   $S_{i+1,j+1}$   $S_{i1,j+1}$ ($i1 > i + 1$)
0              1             7               7
1              X             7               7
2              X             X               7
3              3             6               6
4              5             5               5
5              X             5               5
6              X             X               X
7              X             X               X
Table 4.2: State changes due to $A_I$ and $B_I$
State before the arrival   State after the arrival of $A_I$ and $B_I$
of $A_I$ and $B_I$         Due to $A_I$    Due to $B_I$
0                          4               3
1                          5               3
2                          4               6
3                          X               3
4                          4               X
5                          5               X
6                          X               6
7                          X               X
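Table 4.2 can be read as a lookup from the present state and the arriving request to the next state. The sketch below encodes the table that way; the dictionary and function names are hypothetical, and `None` stands for the "X" entries (a combination the text treats as not occurring).

```python
X = None  # "X" entry: the switch is not expected to be in this state

# Next state of a vertical switch, indexed by its present state (Table 4.2).
DUE_TO_A_I = {0: 4, 1: 5, 2: 4, 3: X, 4: 4, 5: 5, 6: X, 7: X}
DUE_TO_B_I = {0: 3, 1: 3, 2: 6, 3: 3, 4: X, 5: X, 6: 6, 7: X}


def next_state(state, signal):
    """Return the state after `signal` ('A_I' or 'B_I') arrives, or None for X."""
    table = DUE_TO_A_I if signal == "A_I" else DUE_TO_B_I
    return table[state]


# A switch in ST_0 that receives A_I moves to ST_4; its upper neighbour in
# ST_0 then receives B_I and moves to ST_3.
print(next_state(0, "A_I"), next_state(0, "B_I"))
```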
Table 4.3: State changes for Final Stage
State          State after t + 1
before t + 1   $S_{i,j+1}$   $S_{i+1,j+1}$   $S_{i1,j+1}$ ($i1 > i + 1$)
0              X             X               X
1              1             X               X
2              X             X               X
3              3             X               X
4              X             X               X
5              5             4               4
6              X             6               3
7              X             2               0
For bringing the switches to their prior-to-$t_1$ state, the reset and set inputs of the
flip-flops are used. For all the other changes the switches are clocked into their
new states. Since the external switch-clock, $CLK_S$, comes only in the case of a
$PE_{i,j}$ failure and is supposed to modify the switches of column (j + 1) only, it is
AND-ed with D. $A_I$ and $B_I$ work as the clock for all the switches $S_{i1,j1}$ ($j1 \ne j + 1$).
So, $\overline{D} \cdot (A_I + B_I)$ is delayed and OR-ed with $D \cdot CLK_S$ to get the final clock to
the switches. $\overline{D} \cdot (A_I + B_I)$ is delayed to ensure the presence of the proper inputs at
the flip-flops before the clock edge, $CLK_{FF}$, appears.
If the first rising edge of $CLK_S$ is blocked from reaching the J-K flip flops
of the switch $S^V_{i,j+1}$ when it is in $ST^V_5$, then the various state changes (at the clock edge)
can be listed as in Table 4.4 (for generating this table, Tables 4.1, 4.2 and 4.3 are
combined). When $S^V_{i,j+1}$ is in $ST^V_5$, the first edge of $CLK_S$ is blocked by using a
J-K flip flop and two gates. This circuit allows only the second rising edge of
$CLK_S$ to appear as $CLK_{FF}$, if the switch is in $ST^V_5$.
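The first three columns of Table 4.4 can be reproduced from Tables 4.1 and 4.3 with a simple rule. The sketch below is an interpretation, not a statement of the circuit: it assumes that a switch in $ST^V_5$ skips the intermediate edge (the blocked first edge) and takes its final-stage entry, while every other state takes its intermediate-stage entry when one exists and its final-stage entry otherwise. All names are hypothetical and `None` stands for "X".

```python
X = None  # "X" entry: the switch cannot be in this state

# Columns: S(i,j+1), S(i+1,j+1), S(i1,j+1) with i1 > i+1.
INTERMEDIATE = {  # Table 4.1
    0: (1, 7, 7), 1: (X, 7, 7), 2: (X, X, 7), 3: (3, 6, 6),
    4: (5, 5, 5), 5: (X, 5, 5), 6: (X, X, X), 7: (X, X, X),
}
FINAL = {  # Table 4.3
    0: (X, X, X), 1: (1, X, X), 2: (X, X, X), 3: (3, X, X),
    4: (X, X, X), 5: (5, 4, 4), 6: (X, 6, 3), 7: (X, 2, 0),
}
BLOCKED_STATE = 5  # state for which the first CLK_S edge is blocked


def clock_edge_row(state):
    """Combine the two stage tables into one row of Table 4.4 (first 3 columns)."""
    if state == BLOCKED_STATE:
        return FINAL[state]
    return tuple(f if m is X else m
                 for m, f in zip(INTERMEDIATE[state], FINAL[state]))


for st in range(8):
    print(st, clock_edge_row(st))
```

Under this reading the combined table never needs to know which of the two edges it is looking at, which is consistent with the stated purpose of blocking the first edge for $ST^V_5$.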
At t + 1, $F_L$ brings the switches to their prior-to-$t_1$ state using the set-reset inputs
of the flip flops. $F_L$ uses the outputs of the D-latches, $s_{2L}$, $s_{1L}$ and $s_{0L}$, for bringing
the switch to its prior-to-$t_1$ state. For the J-K flip flops, the set-reset inputs are listed in
Equation 4.1.
$$R_2 = F_L \cdot CLK_S \cdot s_{2L}, \quad S_2 = \overline{R_2},$$
$$R_1 = F_L \cdot CLK_S \cdot s_{1L}, \quad S_1 = \overline{R_1}, \tag{4.1}$$
$$R_0 = F_L \cdot CLK_S \cdot s_{0L} \quad \text{and} \quad S_0 = \overline{R_0}.$$
The set-reset inputs go through tri-states to the flip flops and the tri-states are
activated only when these changes are required.
Similarly, the J-K inputs are derived and written in Equation 4.3.
Table 4.4: State changes due to clock edge
State before     State after the clock edge
the clock edge   $\overline{B_I}\,\overline{C}\,D$   $B_I\,\overline{C}\,D$   $B_I\,C\,D$   $\overline{D}\,A_I$   $\overline{D}\,B_I$
0                1     7     7     4     3
1                1     7     7     5     3
2                X     X     7     4     6
3                3     6     6     X     3
4                5     5     5     4     X
5                5     4     4     5     X
6                X     6     3     X     6
7                X     2     0     X     X
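Table 4.5: State changes using set-reset inputs
State   State after the change
        $F_L \cdot CLK_S$             $A_I \cdot \overline{B_I} \cdot s_2 \cdot s_1 \cdot s_0$   $\overline{A_I} \cdot B_I \cdot s_2 \cdot s_1 \cdot s_0$
0       goes to prior-to-$t_1$ state  X     X
1       goes to prior-to-$t_1$ state  X     X
2       goes to prior-to-$t_1$ state  X     X
3       goes to prior-to-$t_1$ state  X     X
4       goes to prior-to-$t_1$ state  X     X
5       goes to prior-to-$t_1$ state  X     X
6       goes to prior-to-$t_1$ state  X     X
7       goes to prior-to-$t_1$ state  5     6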
Table 4.6: Generation of $A_O$
State   $\overline{B_I}\,\overline{C}\,D$   $B_I\,\overline{C}\,D$   $B_I\,C\,D$   $A_I\,\overline{D}$   $E_L$
0       0     0     0     0     0
1       0     0     0     0     0
2       0     0     0     1     1
3       0     0     0     0     0
4       0     1     1     0     0
5       0     *     1     0     0
6       0     0     0     0     0
7       0     0     0     0     1
(4.3)
$B_O$ is generated whenever C or $A_I$ is there, therefore:
$$B_O = C + A_I. \tag{4.4}$$
$A_O$ is generated depending on the switch state and various inputs. In the case
of a $PE_{i,j}$ failure, $A_O$ is generated by the switch $S_{i1,j+1}$ ($i1 > i$), which is either
in state $ST^V_4$ or in $ST^V_5$. When a switch which is in $ST^V_2$ receives $A_I$,
it also generates $A_O$. Any $A_O$ generated by $S_{i1,j+1}$ ($i1 > i + 1$) stays high till t + 1,
so that $F_L$ stays high for bringing various switches to their prior-to-$t_1$ state, if
required. Various combinations for generating $A_O$ are listed in Table 4.6. The
entry for $B_I \cdot \overline{C} \cdot D$ corresponding to state 5 is marked as '*', because this condition
generates $A_O$, which stays high only until the rising edge of $CLK_S$ arrives (even
though the switch remains in $ST^V_5$).
Consequently,
$$A_O = A_{O1} + A_{O2},$$
$$A_{O2} = B_I \overline{C} D\,(s_2 \overline{s_1}\,\overline{s_0}) + B_I C D\,(s_2 \overline{s_1}) + A_I\,(\overline{s_2} s_1 \overline{s_0}) + E_L\,(\overline{s_2} s_1 \overline{s_0} + s_2 s_1 s_0). \tag{4.5}$$
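The $A_{O2}$ expression can be checked against Table 4.6 by brute force. The sketch below assumes that the state number is the binary value $s_2 s_1 s_0$ and that the complement bars are placed as the table implies (for example, state 4 is $s_2 \overline{s_1}\,\overline{s_0}$); both are readings of the text rather than confirmed circuit details. The function and table names are hypothetical, and the '*' entry is skipped because it describes a transient that the static expression does not cover.

```python
def a_o2(state, b_i=0, c=0, d=0, a_i=0, e_l=0):
    """Evaluate the A_O2 sum-of-products for a 3-bit switch state."""
    s2, s1, s0 = (state >> 2) & 1, (state >> 1) & 1, state & 1
    term_pe_near = b_i and not c and d and (s2 and not s1 and not s0)
    term_pe_far = b_i and c and d and (s2 and not s1)
    term_a_i = a_i and (not s2 and s1 and not s0)
    term_link = e_l and ((not s2 and s1 and not s0) or (s2 and s1 and s0))
    return int(bool(term_pe_near or term_pe_far or term_a_i or term_link))


# Table 4.6 rows; columns: ~B_I ~C D, B_I ~C D, B_I C D, A_I ~D, E_L.
TABLE_4_6 = {
    0: (0, 0, 0, 0, 0), 1: (0, 0, 0, 0, 0), 2: (0, 0, 0, 1, 1),
    3: (0, 0, 0, 0, 0), 4: (0, 1, 1, 0, 0), 5: (0, "*", 1, 0, 0),
    6: (0, 0, 0, 0, 0), 7: (0, 0, 0, 0, 1),
}
COLUMN_INPUTS = [
    dict(b_i=0, c=0, d=1), dict(b_i=1, c=0, d=1), dict(b_i=1, c=1, d=1),
    dict(a_i=1, d=0), dict(e_l=1),
]

mismatches = [
    (state, col)
    for state, row in TABLE_4_6.items()
    for col, (entry, inputs) in enumerate(zip(row, COLUMN_INPUTS))
    if entry != "*" and a_o2(state, **inputs) != entry
]
print(mismatches)
```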
$SVI^{PE_{i,j-1}}_{S_{i,j}}$ is generated by $S^V_{i,j}$ and it is used by $PE_{i,j-1}$ for selecting the proper
vertical input port. If $PE_{i,j-1}$ is using input port $I^V_{PE0}$ and an input error is detected,
$SVI^{PE_{i,j-1}}_{S_{i,j}}$ is made high and $I^V_{PE1}$ is selected. Now, if $S^V_{i,j}$ receives D, it resets
$SVI^{PE_{i,j-1}}_{S_{i,j}}$ and $I^V_{PE0}$ is selected again. When $S^V_{i,j}$ is in $ST^V_2$ or in $ST^V_7$ and a link
failure signal arrives, $A_O$ is generated, which moves towards $S_{i,0}$ until it finds $S^V_{i,j1}$,
which is not in $ST^V_2$. In this case, if the SVI signal of $PE_{i,j1}$ is high, it means that
$PE_{i,j1}$ is using $I^V_{PE1}$ due to an earlier link failure. In this case, during the final
rerouting, this SVI signal is not reset.
Figure 4.16: Block Diagram of the Horizontal Switch (combined PE and link failure
algorithm)
Horizontal Switch - This also has two sub-circuits: the control circuit and the
switching circuit. The block diagram of the control circuit is given in Figure 4.16.
When $S^H_{i,j}$ receives $RR^{S_{i,j}}_{PE_{i-1,j}}$, it toggles either from $ST^H_0$ to $ST^H_1$ or from $ST^H_1$
to $ST^H_0$. Similarly, when $D \cdot B_I$ is '1', the t + 1 edge of $CLK_S$ toggles it from one
state to the other. For this, $D \cdot B_I$ is latched at the falling edge of $CLK_S$ and the
latched signal is AND-ed with $CLK_S$ to generate the clock for the flip flop.
The testing block of the switches tests the logic circuit of the switches and the
FF signal is generated when a faulty switch receives any reconfiguration request.
The operation of the algorithm is shown in the next section.
4.5 Operation of the Algorithm 
Consider the array, shown in Figure 4.17, which has no faults at $t_1$, at which time
$PE_{i,j}$ detects a vertical input error. Immediately, $PE_{i,j}$ generates $LF^{CP}_{PE_{i,j}}$ (which
delays the next rising edge, t + 1, of $CLK_{PE}$) and $LF^{S_{i,j+1}}_{PE_{i,j}}$. When $S_{i,j+1}$ receives
E, it makes $E_L$ and X high. X is fed back to $PE_{i,j}$, which selects $I^V_{PE1}$ in place of
$I^V_{PE0}$ and it resets E, because the second link is non-faulty.
Figure 4.17: Operation of the Algorithm (combined PE and link failure algorithm)
Let us assume that at $t_2$, $PE_{i-1,j}$ fails and various RRs are generated. $S_{i-1,j+1}$
receives D, $S_{i,j+1}$ receives $B_I$ & D and $S_{ix,j+1}$ ($i < ix \le m + 1$) receive $B_I$, C & D.
Once $S_{i,j+1}$ receives D, it resets X and $PE_{i,j}$ is forced to select $I^V_{PE0}$ again. $B_I$, C & D
change the inputs of the J-K flip flops and the new inputs are listed in Table 4.7.
Table 4.7: J-K flip flop inputs at $t_2$
Flip-Flop   Inputs at $t_2$
Inputs      $S^V_{i-1,j+1}$   $S^V_{i,j+1}$   $S^V_{ix,j+1}$ ($i < ix < m$)   $S^V_{m,j+1}$   $S^V_{m+1,j+1}$
$JK_2$      0                 1               1                               1               1
$JK_1$      0                 1               1                               1               0
$JK_0$      1                 1               1                               0               1
At $t_s$, the positive edge of $CLK_S$ arrives and the switches change their states
depending on the J-K inputs. These changes are listed below:
• $S^V_{i-1,j+1}$ goes to $ST^V_1$ and
• $S^V_{ix,j+1}$ ($i \le ix \le m + 1$) go to $ST^V_7$.
At $t_2$, $PE_{ix,j}$ ($i - 1 \le ix \le m$) generate $RR^{S_{ix+1,j}}_{PE_{ix,j}}$, which brings $S^H_{ix,j}$ ($i \le ix \le m + 1$)
to $ST^H_1$ (required for the horizontal data intermediate stage routing). $RR^{PE_{ix+1,j}}_{PE_{ix,j}}$
makes $PE_{ix+1,j}$ select the other horizontal input port pair (here it selects $I^H_{PE2}$ and
$I^H_{PE3}$).
Since the earlier vertical input error was detected by $PE_{i,j}$, it reappears at
$t_3$ (after the re-routed data are checked). There are two possibilities for its detection
and they are described next (if the earlier link failure was due to a switching circuit
failure, it would not appear during the intermediate stage because the switch has
changed state and a different path is in use).
Case A - If the first vertical input error was due to a failed link that now lies on the
vertical input path of $PE_{i+1,j}$, then $PE_{i+1,j}$ would now detect a vertical input error at
$t_3$. In this case, at $t_3$, $PE_{i+1,j}$ generates $LF^{CP}_{PE_{i+1,j}}$ (which delays the next $CLK_{PE}$ and
$CLK_S$ edges) and $LF^{S_{i+1,j+1}}_{PE_{i+1,j}}$. Once $S_{i+1,j+1}$ receives E, it latches it as $E_L$ and
generates X, which makes $PE_{i+1,j}$ select $I^V_{PE1}$ in place of $I^V_{PE0}$. At the same time,
$S_{i+1,j+1}$ generates $A_O$, which is passed on to $S_{i+1,j}$, which in turn generates $B_O$ and
feeds it to $S_{i,j}$. $A_I$ and $B_I$ change the inputs of the J-K flip flops of $S^V_{i+1,j}$ and $S^V_{i,j}$
respectively. The new inputs are listed in Table 4.8. These $A_I$ and $B_I$ appear as
$CLK_{FF}$ after a delay (which ensures the presence of proper information at the flip flop
inputs) and it changes the states of $S^V_{i+1,j}$ and $S^V_{i,j}$ to $ST^V_4$ and $ST^V_3$ respectively.
The previous states get latched in the D-flip flops. $(A_I + B_I)$ gets latched as $F_L$ at the
falling edge of $CLK_S$ (at $t_4$). Here, $(A_I + B_I)$ would be high at $t_4$, because $E_L$ gets reset
at t + 2 and $A_O$ (of $S_{i+1,j+1}$) is high at $t_4$.
Table 4.8: J-K flip flop inputs due to $A_I$ and $B_I$
Flip-Flop   Inputs due to $A_I$ and $B_I$
Inputs      $S^V_{i+1,j}$   $S^V_{i,j}$
$JK_2$      1               0
$JK_1$      0               1
$JK_0$      0               1
Table 4.9: J-K flip flop inputs at t + 2
Flip-Flop   Inputs at t + 2
Inputs      $S^V_{i-1,j+1}$   $S^V_{i,j+1}$   $S^V_{ix,j+1}$ ($i < ix < m + 1$)
$JK_2$      0                 1               1
$JK_1$      0                 0               1
$JK_0$      0                 0               1
Case B - If the first vertical input error occurred due to a failed link that still lies on
the vertical input path of $PE_{i,j}$, then $PE_{i,j}$ will report a vertical input error again and
the same changes occur as explained earlier in Case A, but here $A_I + B_I$ would not stay
as $F_L$ till t + 2, because $E_L$ gets reset at $t_4$ and it resets $A_O$ (of $S_{i,j+1}$).
At $t_4$, the J-K inputs of the flip flops change again due to a change in the
switch state. The new inputs are listed in Table 4.9. At t + 2, the next clock edge of
$CLK_S$ appears and it completes the final stage of reconfiguration by changing the
states of $S^V_{i,j+1}$ and $S^V_{ix,j+1}$ ($i < ix \le m + 1$) according to these inputs. At the same
time, if the previously explained Case A is valid, $S^V_{i+1,j}$ and $S^V_{i,j}$ are brought back to
$ST^V_0$ (because $F_L$ is high at t + 2, which enables the set-reset tri-states and these
asynchronous inputs of the flip flops load the previous states in these flip flops) and X
(of $S_{i+1,j+1}$) is reset by $A_O \cdot CLK_S$. There is no change in the states of $S^V_{i+1,j}$ and
$S^V_{i,j}$ for Case B at t + 2. Clock edge t + 2 toggles $S^H_{ix,j+1}$ ($i - 1 < ix < m + 1$) to its
other state now and it completes the total reconfiguration.
4.6 Concluding Remarks 
In this chapter an on-line reconfiguration scheme for PE and link failures was
discussed. Here an extra row of cells (called spares) is provided to the array and
in the case of a detected PE failure a global shift is performed for the corresponding
column. The links are duplicated to provide link redundancy and link failures
are detected by checking parity bits. The redundant vertical link is taken through
a different data path than the original link because in this configuration, the complete
failure of a switch block will have a lesser effect on the overall reliability. When a
horizontal link fails, the PE automatically selects the other horizontal input port
and when a vertical link fails, the PE informs the neighboring switch about this
failure. The neighboring switch invokes the switch state changes and commands
the PE to select the other vertical input port.
The control circuits for the PEs and switches were designed and the network was
modified to support the algorithm. It was proved that the proposed eight states
of the vertical switches and two states of the horizontal switches are sufficient to
support the algorithm.
Here it is assumed that the link failures are detected by the PEs by using parity
bit checks. The number of parity bits can be chosen depending upon the reliability
requirement. With one parity bit, only an odd number of bit errors can be detected.
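The limitation of a single parity bit can be seen in a few lines. This is a generic illustration assuming even parity over a list of bits; the thesis does not specify the parity convention or the word width, and the function names are hypothetical.

```python
def add_even_parity(bits):
    """Append one parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def error_detected(word):
    """A received word fails the check when its number of 1s is odd."""
    return sum(word) % 2 == 1


def flip(word, positions):
    """Return a copy of `word` with the bits at `positions` inverted."""
    return [b ^ 1 if k in positions else b for k, b in enumerate(word)]


sent = add_even_parity([1, 0, 1, 1, 0, 0, 1, 0])
print(error_detected(flip(sent, {3})))        # one bit in error
print(error_detected(flip(sent, {3, 5})))     # two bits in error
print(error_detected(flip(sent, {0, 3, 5})))  # three bits in error
```

An even number of flipped bits leaves the count of 1s even, so the check passes although the data are wrong; this is why more parity bits may be chosen when higher reliability is required.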
The algorithm is evaluated in the next chapter. 
Chapter 5 
Algorithm Evaluation 
The reconfiguration algorithms are evaluated based on the following criteria (19):
• probability of survival - defined as the probability of correct reconfigurations
in the presence of x faults, $x \le S$, where S is the number of spare cells in the
array,
• locality of interconnections,
• time complexity of the reconfiguration algorithm and
• area complexity of the switching and routing circuits.
These features are conflicting. It is possible to develop an algorithm which is
simple and maintains high locality, but the probability of survival degrades in this
case for an increasing number of faults.
The proposed algorithm maintains high locality by allowing only one downward
shift in the case of a failure. The algorithm introduces very small time delays when
a fault occurs. It is assumed that the central processor provides the clock pulses to
the PEs and switches. When a PE fails, the central processor reduces the clock
speed for the next two clock periods. This can be achieved by simply blocking the
on-period of the clock when the delay is required. If this method is used, the failure
of any PE would introduce a delay of 2 clock periods in the operation and a link
failure would introduce a delay of 1 clock period in the operation.
The increase in the complexity of the switching circuit is not large for the only
PE failure algorithm, but the switches and the network are slightly more complex
for the combined PE and link failure algorithm.
The probability of survival is derived analytically first and then simulation
results are presented.
5.1 Analytical Results
We consider a 4 x 4 active array, which needs a physical array of size 5 x 4. The array
has 4 spare cells, 20 vertical active links and 20 horizontal active links (including
input and output links).
The PE and link failures are considered separately in the following subsections.
5.1.1 Probability of Survival After a PE Failure
The above mentioned array cannot tolerate more than four PE failures, because it
has only four spares. As explained earlier, each column can tolerate only one faulty
PE.
One Failure - The probability of survival in this case is 100%, because the first
fault is always tolerated.
Two Failures - If the first fault is in column 0 and the second fault occurs in one
of the remaining columns, the array can tolerate these two faults and the total
number of combinations for this occurrence is 5 x 15, because there are five PEs
in column 0 and fifteen PEs in the other columns. Similarly, if the first fault is in
column 1 and the second fault occurs in any one of the remaining columns, the array
can tolerate these two faults. Since the case of one fault in column 0 and the
other fault in column 1 is included earlier (where column 0 has the first fault and
column 1 has the second fault), the number of combinations for the occurrence of
two reconfigurable faults (which are tolerated by the algorithm), with the first fault in
column 1, is 5 x 10, because there are five PEs in column 1 and ten PEs in column
2 and column 3. By the same argument, a first fault in column 2 leaves only column 3
for the second fault, which gives 5 x 5 combinations. So, the total number of
combinations for two reconfigurable faults can be written as:
$$\text{Success}_2 = 5 \times 15 + 5 \times 10 + 5 \times 5.$$
The total number of combinations for 2 faults is $\binom{20}{2}$.
The probability of survival for two failures is:
$$P_2 = \frac{\text{Success}_2}{\binom{20}{2}} = \frac{75 + 50 + 25}{190} = 0.7895 = 78.95\%. \tag{5.1}$$
Three Failures - The number of combinations for three reconfigurable faults can
be written as:
1. one fault in column 0, one in column 1 and one fault either in column 2 or in
column 3: 5 x 5 x 10 = 250,
2. one fault in column 0, one fault in column 2 and one fault in column 3: 5 x 5 x 5
= 125 and
3. one fault in column 1, one fault in column 2 and one fault in column 3:
5 x 5 x 5 = 125.
So,
$$\text{Success}_3 = 250 + 125 + 125 = 500.$$
The probability of survival in the presence of three faults can be written as:
$$P_3 = \frac{\text{Success}_3}{\binom{20}{3}} = \frac{500}{1140} = 0.4386 = 43.86\%. \tag{5.2}$$
Four Failures - The number of combinations for four reconfigurable faults can
be written as:
1. one fault in column 0, one fault in column 1, one fault in column 2 and one
fault in column 3: 5 x 5 x 5 x 5 = 625.
So,
$$\text{Success}_4 = 625.$$
The probability of survival in the presence of four faults can be written as:
$$P_4 = \frac{\text{Success}_4}{\binom{20}{4}} = \frac{625}{4845} = 0.1290 = 12.90\%. \tag{5.3}$$
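The three results above follow one pattern: k faults are tolerated only when they fall in k different columns, so the number of reconfigurable combinations is the number of ways to choose k of the 4 columns times 5 choices of row in each. The sketch below evaluates this closed form for the 5 x 4 physical array; the function name and its parameters are hypothetical.

```python
from math import comb


def pe_survival(k, rows=5, cols=4):
    """Probability that k PE faults land in k distinct columns of the array."""
    if k > cols:
        return 0.0
    success = comb(cols, k) * rows ** k      # choose columns, then a row in each
    total = comb(rows * cols, k)             # all ways to place k faulty PEs
    return success / total


for k in range(1, 5):
    print(k, round(pe_survival(k), 4))
```

For k = 2, 3 and 4 this gives 150/190, 500/1140 and 625/4845, the same counts as Equations 5.1 to 5.3.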
5.1.2 Probability of Survival After a Link Failure 
There are 20 vertical and 20 horizontal active links in the array. When an active
link fails, it is replaced by the spare link and the spare is then called the active link.
The algorithm checks only the active links. So for this calculation, only the active
links are considered. The links are designated depending on their destination. For
example, the link carrying the vertical input from the central processor to $PE_{0,0}$
is named V_link (0,0). Similarly, a link carrying the horizontal data from $PE_{0,0}$
to $PE_{0,1}$ is named H_link (0,1). So, the links can be taken as array elements
and the vertical link array (V_link array) would be a 5 x 4 array with elements from
(0,0) through (4,3). The bottom-most row of elements represents the links which
connect the vertical output of the array to the central processor and since the
switches providing these links are in $ST^V_1$ and $ST^V_2$, it is assumed that the last
row of links (array elements) can survive only one faulty link (element). All other
array elements can survive one fault. Since in the event of a link failure, the spare
link replaces the faulty link and the spare is given the same index (making that
element of the array active again), each element of the array can fail twice. The
second failure of any array element leads to fatal failure. Similarly, the horizontal
links can be written as a 4 x 5 array (H_link array), where the elements of column
4 represent the links connecting the horizontal output of the array to the central
processor. Here, each array element can survive one failure and the second failure
of the same element leads to a fatal failure.
One Fault - The probability of survival in this case is 100%, because one fault is
always tolerated.
Two Faults - When element (0,0) of the V_link array fails first, the failure of any
other element in the V_link array and the H_link array is tolerated, but the next failure
of element (0,0) of the V_link array leads to fatal failure. The number of
combinations of two reconfigurable link failures, with V_link (0,0) as the first failure,
is:
19 (remaining V_link array elements) + 20 (H_link array elements) = 39
and with (0,0) as the first failure, there are 40 combinations of two failures.
Similarly, when element (0,1) of the V_link array fails first, the number of
combinations of two reconfigurable faults would be 18 + 20 = 38. Failure of element (0,0)
is not included here as the second failure because this combination (failure of (0,0)
and (0,1)) is already included in the first case (where (0,0) is the first failure). The
number of possible combinations of two failures in this case is 39.
So, the total number of reconfigurable two failures is:
$$\text{Success}_2 = \sum_{k=1}^{39} k - \sum_{k=1}^{3} k = 780 - 6 = 774,$$
where $\sum_{k=1}^{3} k = 6$ is the number of combinations of two faults with both faults in the
bottom-most row of the V_link array (it is assumed that the bottom-most row can
tolerate only one fault).
The total number of combinations for two link failures is $\sum_{k=1}^{40} k = 820$.
So the probability of survival in the presence of two faulty links is:
$$P_2 = \frac{774}{820} = 0.9439 = 94.39\%. \tag{5.4}$$
Three Faults - The number of combinations for three reconfigurable faults can be 
derived as follows: 
1. when V_link(0,0) is one of the faulty links, the number of combinations would 
be $\binom{39}{2} - \binom{4}{2} = 735$, where $\binom{4}{2}$ is the number of combinations with 
two faults in the bottom most row of the V_link array, and 
2. when V_link(0,1) is one of the faulty links (but V_link(0,0) is not faulty), 
the number of combinations would be $\binom{38}{2} - \binom{4}{2} = 697$ and so on. 
So, the number of combinations for three reconfigurable faults can be written 
as: 
$\mathrm{Success}_3 = \left[ \binom{39}{2} + \binom{38}{2} + \binom{37}{2} + \cdots + \binom{2}{2} \right] - B,$ 
where B is the number of combinations of two or more faults in the bottom most 
row of the V_link array. 
The number of combinations for three faults is: 
$\mathrm{Combinations}_3 = \sum_{k=1}^{40} k + \sum_{k=1}^{39} k + \sum_{k=1}^{38} k + \cdots + 1.$ 
So, the probability of survival for three link failures is: 
$p_3 = \frac{\mathrm{Success}_3}{\mathrm{Combinations}_3} = \frac{9660}{11480} = 0.8415 = 84.15\%.$ (5.5) 
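The two-fault and three-fault counts can be cross-checked by brute force. The following is a minimal sketch and not the thesis program. It assumes that the 40 links of the 4 x 4 array are numbered 0 to 39, that the four links of the bottom most V_link row occupy indices 16 to 19 (an arbitrary labelling), and that the totals count selections with repetition, a repeated index standing for a second failure of an already faulty link. The names `count_outcomes`, `N_LINKS` and `BOTTOM_ROW` are illustrative only.

```python
from itertools import combinations_with_replacement

N_LINKS = 40                          # 20 V_link + 20 H_link elements (4 x 4 array)
BOTTOM_ROW = frozenset(range(16, 20)) # assumed labels of the bottom most V_link row


def count_outcomes(k, n_links=N_LINKS, bottom_row=BOTTOM_ROW):
    """Return (reconfigurable, total) counts for k link faults.

    A selection is counted as reconfigurable when all faulty links are
    distinct and at most one of them lies in the bottom most V_link row.
    """
    total = 0
    success = 0
    for faults in combinations_with_replacement(range(n_links), k):
        total += 1
        all_distinct = len(set(faults)) == k
        in_bottom_row = sum(1 for f in faults if f in bottom_row)
        if all_distinct and in_bottom_row <= 1:
            success += 1
    return success, total


if __name__ == "__main__":
    for k in (2, 3):
        success, total = count_outcomes(k)
        print(k, success, total, round(success / total, 4))
```

Under these assumptions the enumeration gives 774 of 820 for two faults and 9660 of 11480 for three faults, in agreement with the survival probabilities of 94.39% and 84.15% derived above.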
It can be observed that the analysis becomes increasingly complex as the number 
of failures increases. Therefore simulation is used to get the values of probability of 
survival for a greater number of faults. 
5.2 Analysis of Simulation Results 
As explained earlier, it is difficult to calculate the probability of survival for a large 
number of faults and large arrays using an analytical method. So a computer 
program is written (the basic control flow of the program is given in Appendix A) 
to simulate the algorithm with a view to calculating the probability of survival. 
5.2.1 Simulation Software Outline 
The simulation program injects the specified number of faults randomly and checks 
the outcome of the algorithm. For example, when the program needs to inject 
one PE failure and one link failure, it generates a random number PE_LINK, which 
can be either '0' or '1'. When it is '0', a PE failure is injected in the array. For 
this, the index value (i,j) is generated randomly, failure of PE(i,j) is injected 
and the reconfiguration algorithm is performed on the array. 
When PE_LINK is '1', a link failure is injected. Here, another random number, 
H_V, is generated. When H_V is '0' ('1'), a horizontal (vertical) link failure is 
injected by randomly generating the index (i,j) of the link. This program does not 
assume that the bottom-most row of the V_link array can tolerate only one fault 
(as was assumed for the analytical calculation). Instead, here a random number is 
generated, which provides the information about the outcome of the algorithm for 
the failure of the O_j link in presence of a faulty O_j1 (j1 < j). When O_j fails in presence 
of a faulty O_j1 (j1 > j), fatal failure occurs. The program simulates the algorithm 
completely by changing the states of the switches, PEs and links and checking the 
outcome. 
The program injects the specified number of faults a specified number of 
times n (by going in the same loop) and every time it starts with a fresh array 
(fault free array). 
Every time the program enters the loop (the process is called a trial), it returns 
one of the two possible outcomes, success, S, or failure, F. An outcome of S informs 
that the reconfiguration attempt was successful and F indicates the occurrence of 
a fatal failure. 
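The random choice of fault type and location described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the function name `pick_fault` is hypothetical, Python's `random` module stands in for whatever generator the original program used, and the index ranges assume (m + 1) x n PEs and vertical links and m x (n + 1) horizontal links for an m x n active array.

```python
import random


def pick_fault(rng, m, n):
    """Choose one fault to inject into an m x n active array.

    Returns a tuple (kind, i, j) where kind is "PE", "H_link" or "V_link".
    """
    pe_link = rng.randint(0, 1)       # PE_LINK: 0 -> PE fault, 1 -> link fault
    if pe_link == 0:
        # The PE array includes the spare row: (m + 1) rows, n columns (assumed).
        return ("PE", rng.randrange(m + 1), rng.randrange(n))
    h_v = rng.randint(0, 1)           # H_V: 0 -> horizontal link, 1 -> vertical link
    if h_v == 0:
        # Horizontal links: m rows, (n + 1) columns (assumed).
        return ("H_link", rng.randrange(m), rng.randrange(n + 1))
    # Vertical links: (m + 1) rows, n columns (assumed).
    return ("V_link", rng.randrange(m + 1), rng.randrange(n))


if __name__ == "__main__":
    rng = random.Random(1992)         # fixed seed so a run can be repeated
    for _ in range(5):
        print(pick_fault(rng, 4, 4))
```

The reconfiguration step itself (changing switch, PE and link states and deciding between S and F) is deliberately left out, since it depends on the algorithm of the earlier chapters.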
5.2.2 Confidence Level of the Simulation 
In this simulation, the trials are independent of each other, because every trial 
starts with a fresh array and thus the probability of success remains constant from 
trial to trial. These trials are called Bernoulli trials and the random variable X, 
which denotes the number of successes in n trials, has a binomial distribution given 
by p(x): 
$p(x) = \begin{cases} \binom{n}{x} p^{x} (1-p)^{n-x}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{otherwise,} \end{cases}$ 
where p is the probability of success of any random trial. 
The binomial distribution approaches the normal distribution in the limit as n 
becomes large. In general, the approximation is fairly good as long as n · p > 5 
and n · q > 5, where q = (1 - p). 
The probability density allows one to find the probability that the data would 
assume some value within a specified range at any time. A normal density function 
f(z) (shown in Figure 5.1) determines the shape of the plot. When the number of 
successful trials is X for n trials, the probability of success for a randomly selected 
trial can be estimated as: 
$\hat{p} = \frac{X}{n},$ (5.6) 
where $\hat{p}$ is called the estimate of p. 
Now, two values $p_1$ and $p_2$ (which are functions of $\hat{p}$) can be determined in such 
a way that the probability of p lying between $p_1$ and $p_2$ is $(1 - \alpha)$. That is: 
$P(p_1 \le p \le p_2) = 1 - \alpha.$ 
Therefore $(p_1, p_2)$ forms an interval, which has the probability $(1 - \alpha)$ of capturing 
the true value of p. This interval is called the confidence interval and $(1 - \alpha)$ is 
called the confidence coefficient (confidence level) [22]. 
Figure 5.1: Normal Density Function 
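The confidence interval for p can be written as: 
$\hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}},$ (5.7) 
where $\hat{p}$ and $\hat{q}$ are the estimated values of p and q and $z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}$ is the margin of 
error E in the estimated value. So, 
$E \ge z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}},$ (5.8) 
which gives Equation 5.9. 
$n \ge \left( \frac{z_{\alpha/2}}{E} \right)^{2} \cdot \hat{p} \cdot \hat{q}.$ (5.9) 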
For the simulation, the number of trials is calculated based on Equation 5.9. The 
maximum value of $\hat{p} \cdot \hat{q}$ is 0.25, when $\hat{p} = \hat{q} = 0.5$. If we want the confidence 
level to be 95% and the half width of the confidence interval to be 2%, then 
$n \ge \left( \frac{z_{\alpha/2}}{0.02} \right)^{2} \times 0.25.$ (5.10) 
For a confidence level of 95% ($\alpha = 0.05$), $z_{\alpha/2} = 1.96$ (from the cumulative 
normal distribution table [22]). So, 
$n \ge \left( \frac{1.96}{0.02} \right)^{2} \times 0.25 = 2401.$ (5.11) 
Now, if the total number of trials is more than 2401, it can be said confidently 
that the probability of success in any random trial is $\hat{p} \pm 2\%$, 19 times out of 20. 
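The trial count can be reproduced numerically. The sketch below is only an illustration of the bound n >= (z/E)^2 * p * q. The function name `required_trials` is hypothetical, and the normal quantile is taken from Python's `statistics.NormalDist` instead of the printed table cited in the text.

```python
import math
from statistics import NormalDist


def required_trials(confidence=0.95, margin=0.02, p=0.5):
    """Return (z, n_min) for the bound n >= (z / margin)**2 * p * (1 - p)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # two-sided normal quantile
    n_min = (z / margin) ** 2 * p * (1.0 - p)
    return z, n_min


if __name__ == "__main__":
    z, n_min = required_trials()
    print(round(z, 2), math.ceil(n_min))
```

With the quantile rounded to 1.96, as in the text, the bound evaluates to exactly 2401. The unrounded quantile gives a value slightly below 2401, which rounds up to the same number of trials.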
5.2.3 Probability of Failure 
The probability of survival goes down with the increasing number of faults but 
the probability of occurrence of a large number of faults also goes down. As the 
failures in any array can reasonably be assumed to be independent, the binomial 
distribution can be used to calculate the probability of occurrence of x failures (the 
active array size is m x n). 
If we consider the PE failures, the total number of PEs in the array is (m + 1) x n, 
so the probability of x PE failures is 
$P_{x,PE} = \binom{(m+1) \cdot n}{x} \cdot p_{PE}^{x} \cdot q_{PE}^{(m+1) \cdot n - x},$ 
where $p_{PE}$ is the probability of failure for PEs. Here, the value of $p_{PE}$ would be very 
small, therefore the Poisson distribution can be used to approximate the binomial 
distribution and then 
$P_{x,PE} = \frac{e^{-\lambda_{PE}} \lambda_{PE}^{x}}{x!},$ 
where $\lambda_{PE} = (m+1) \cdot n \cdot p_{PE}$. 
Similarly, the probability of x link failures is 
$P_{x,link} = \frac{e^{-\lambda_{link}} \lambda_{link}^{x}}{x!},$ 
where $\lambda_{link} = [(m+1) \cdot n + m \cdot (n+1)] \cdot p_{link}$. Here $(m+1) \cdot n$ is the number of 
active vertical links and $m \cdot (n+1)$ is the number of active horizontal links. It is 
assumed that the probabilities of a horizontal link failure and a vertical link failure 
are both equal to $p_{link}$. 
Since the occurrences of PE failures and link failures are independent of each 
other, the probability of $x_1$ PE failures and $x_2$ link failures can be written as: 
$P_{x_1,PE} \cdot P_{x_2,link} = \frac{e^{-\lambda_{PE}} \lambda_{PE}^{x_1}}{x_1!} \cdot \frac{e^{-\lambda_{link}} \lambda_{link}^{x_2}}{x_2!}.$ 
The probabilities of occurrence of various failures for a 4 x 4 array are listed in 
Table 5.1 (assumed $p_{PE} = 10^{-4}$ and $p_{link} = 10^{-6}$). The probability of switch and 
link failures is less than that of PEs because the PE circuitry is more complex 
than that of switches and links in most cases. 
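The joint probabilities of occurrence can be recomputed from the Poisson approximation. The following is a minimal sketch: the function names are hypothetical, and the default arguments encode the 4 x 4 array and the assumed failure probabilities of 10^-4 for a PE and 10^-6 for a link.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of exactly x events with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def joint_failure_probability(x_pe, x_link, m=4, n=4, p_pe=1e-4, p_link=1e-6):
    """Probability of x_pe PE failures and x_link link failures (independent)."""
    lam_pe = (m + 1) * n * p_pe                          # (m+1)*n PEs
    lam_link = ((m + 1) * n + m * (n + 1)) * p_link      # vertical + horizontal links
    return poisson_pmf(x_pe, lam_pe) * poisson_pmf(x_link, lam_link)


if __name__ == "__main__":
    for x_pe in range(5):
        for x_link in range(4):
            print(x_pe, x_link, f"{joint_failure_probability(x_pe, x_link):.2e}")
```

For example, one link failure with no PE failure comes out near 3.99 x 10^-5 and three PE failures with no link failure near 1.33 x 10^-9. Small differences in the last printed digit against a tabulated value can arise from rounding.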
5.2.4 Simulation Results 
The results of the simulation program are listed in Table 5.1 for various values of n 
for a 4 x 4 array (the maximum number of injected PE faults is four and injected 
link faults is three). The first column in the table gives the number of faulty 
PEs (i) and the second column gives the number of faulty links (j). The joint 
probability of i PE failures and j link failures is listed in column 3. The estimated 
probability of survival is listed in the other columns for various values of n. It can 
be seen that the estimated value of p becomes stable, once n becomes large. The 
complete table of outputs (for n = 3000, array size = 4 x 4, maximum number of 
PE faults = 4 and maximum number of link faults = 7) is given in Table 5.2 and 
various confidence intervals are calculated and listed in the same table (for 95% 
confidence level). The first two columns of the table give the number of PE and 
link faults. The estimated probability of survival (p, output of the simulation) is 
given in column 3. The confidence interval is calculated based on p and listed in 
column 4. 
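The interval calculation can be sketched as below, using the normal approximation of Section 5.2.2. This is an illustration, not the thesis code: the function name `confidence_interval` is hypothetical, and clamping the bounds to the range 0 to 1 is an assumption made here so that estimates near 0% or 100% do not produce limits outside that range.

```python
import math
from statistics import NormalDist


def confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    lower = max(0.0, p_hat - half_width)   # clamping is an assumption of this sketch
    upper = min(1.0, p_hat + half_width)
    return lower, upper


if __name__ == "__main__":
    lo, hi = confidence_interval(0.8750, 3000)
    print(f"{100 * lo:.2f} - {100 * hi:.2f}")
```

For an estimate of 87.50% from 3000 trials this gives 86.32 - 88.68, the interval quoted for that estimate in Table 5.2.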
It can be seen that the analytically calculated values of probability of survival 
are well within the confidence interval (calculated from the simulation results) for 
PE failures but they are below the confidence interval for link failures. This is because 
of the assumption that the bottom most row of the V_link array can survive only one 
fault, which was made for the analytical calculation. 
The overall probability of survival for a 4 x 4 array is calculated based on 
Tables 5.1 and 5.2 and it is 99.903%. 
Table 5.1: Estimated Values of Probabilities of Survival (Array Size = 4 x 4) 
Number  Number   Probability       Estimated value, p (%) 
of PE   of link  of this 
Faults  faults   occurrence *      n=10     n=100    n=1000   n=2100   n=5000 
(i)     (j) 
 0       0       9.97 x 10^-1     100.00   100.00   100.00   100.00   100.00 
 0       1       3.99 x 10^-5     100.00   100.00   100.00   100.00   100.00 
 0       2       7.98 x 10^-10    100.00    92.00    95.30    95.00    95.82 
 0       3       1.06 x 10^-14     50.00    87.00    88.80    88.76    88.12 
 1       0       1.99 x 10^-3     100.00   100.00   100.00   100.00   100.00 
 1       1       7.98 x 10^-8     100.00   100.00    99.70    99.43    99.50 
 1       2       1.60 x 10^-12     90.00    95.00    94.10    95.76    95.38 
 1       3       2.13 x 10^-17     90.00    86.00    87.80    87.42    87.90 
 2       0       1.99 x 10^-6      80.00    77.00    77.10    76.86    78.60 
 2       1       7.98 x 10^-11     40.00    78.00    78.40    79.19    78.18 
 2       2       1.60 x 10^-15     90.00    72.00    71.00    75.43    74.54 
 2       3       2.12 x 10^-20     80.00    71.00    69.70    69.62    67.48 
 3       0       1.33 x 10^-9      40.00    49.00    43.40    45.00    44.78 
 3       1       5.32 x 10^-14     40.00    40.00    41.80    42.43    43.24 
 3       2       1.06 x 10^-18     60.00    36.00    41.50    40.81    41.02 
 3       3       1.42 x 10^-23     40.00    41.00    39.90    35.71    37.40 
 4       0       6.65 x 10^-13      0.00    22.00    14.90    12.71    12.82 
 4       1       2.66 x 10^-17      0.00    11.00    12.20    12.90    12.32 
 4       2       5.32 x 10^-22     20.00    14.00    12.00     9.86    12.32 
 4       3       7.10 x 10^-27     30.00    10.00    11.50    11.76    10.70 
Table 5.2: Estimated Values of Probabilities of Survival (Array Size = 4 x 4) 
Number  Number   Estimated            Confidence Interval (%) 
of PE   of link  Probability, p (%)   n = 3000 
Faults  Faults   (from simulation)    (Confidence level = 95%) 
 0       0       100.00               100.00 - 100.00 
 0       1       100.00               100.00 - 100.00 
 0       2        95.77                95.05 -  96.49 
 0       3        87.50                86.32 -  88.68 
 0       4        77.30                75.80 -  78.80 
 0       5        65.80                64.10 -  67.50 
 0       6        55.13                53.35 -  56.91 
 0       7        43.47                41.69 -  45.24 
 1       0       100.00               100.00 - 100.00 
 1       1        99.33                99.04 -  99.62 
 1       2        95.17                94.40 -  95.93 
 1       3        88.40                87.25 -  89.55 
 1       4        77.30                75.80 -  78.80 
 1       5        64.67                62.96 -  66.38 
 1       6        54.80                53.02 -  56.58 
 1       7        41.73                39.97 -  43.50 
 2       0        79.17                77.71 -  80.62 
 2       1        77.77                76.28 -  79.25 
 2       2        74.50                72.94 -  76.06 
 2       3        69.00                67.34 -  70.66 
 2       4        61.67                59.93 -  63.41 
 2       5        51.37                49.58 -  53.16 
 2       6        43.90                42.12 -  45.68 
 2       7        35.27                33.56 -  36.98 
 3       0        43.47                41.69 -  45.24 
 3       1        43.93                42.16 -  45.71 
 3       2        42.47                40.70 -  44.24 
 3       3        37.83                36.10 -  39.57 
 3       4        33.47                31.78 -  35.16 
 3       5        28.50                26.88 -  30.12 
 3       6        24.57                23.03 -  26.11 
 3       7        20.53                19.09 -  21.98 
 4       0        12.07                10.90 -  13.23 
 4       1        12.90                11.70 -  14.10 
 4       2        15.20                13.92 -  16.48 
 4       3        10.40                 9.31 -  11.49 
 4       4         9.80                 8.74 -  10.86 
 4       5         8.03                 7.06 -   9.01 
 4       6         7.27                 6.34 -   8.20 
 4       7         6.13                 5.27 -   6.99 
Various probabilities of survival for different array sizes are listed in Appendix B 
for 95% confidence level. 
5.3 Concluding Remarks 
In this chapter, the proposed algorithm was evaluated. It was shown that this 
algorithm introduces a delay of two clock periods for PE failures and of one clock 
period for link failures. Therefore it can be inferred that the time overhead is very 
small. 
The locality of interconnections is maintained here by using global deformation. 
The amount of increase in the hardware is very small for the PE-failure-only 
algorithm but it is slightly more for the combined PE and link failure algorithm 
due to the complex vertical switch control circuit. 
It can be seen that though the probability of survival is less for a large number 
of faults, the probability of this occurrence is also low. The overall probability of 
survival of this algorithm for a 4 x 4 array is 99.903% (assumed $p_{PE} = 10^{-4}$ and 
$p_{link} = 10^{-6}$). 
Chapter 6 
CONCLUSIONS 
The processing speed of a computation can be increased by ensuring multiple 
computations per memory access. Systolic arrays accomplish this and in addition these 
arrays provide modularity and regular data flow. 
To improve the yield and reliability, various fault detection and reconfiguration 
schemes are used. In Chapter 1, the concept of systolic arrays was explained and 
various existing reconfiguration algorithms were discussed. It can be seen that 
most of the existing schemes are efficient for improving the production time yield 
but they are not suitable for run-time reliability improvement because they need 
an external processor to run the algorithm. In addition these schemes assume the 
network to be always fault-free, which is difficult to achieve. The scheme proposed 
here does not assume a perfect switching network and it is capable of tolerating 
the link failures also. 
The scheme proposed in this report can be used efficiently for on-line 
reconfiguration to improve run-time reliability. The algorithm for PE failures was presented 
in Chapter 2. A bottom row of spares is provided to the array and in the case of a 
PE failure, a global shift is performed, if the spare cell (for the particular column) 
is available. The PEs are of a self-testing type and in the event of a fault detection, 
PEs invoke the reconfiguration by generating the reconfiguration requests. 
An algorithm for PE and link failures was presented in Chapter 3. Here, each 
link is duplicated and a bottom row of spare cells is provided to the array. In the 
case of a PE failure a global shift is performed if the spare cell (for the particular 
column) is available. The link failures are detected by using parity bits: the PEs 
perform parity checks on incoming data and any error in the incoming data is 
taken as a failure of the incoming link. In the event of a horizontal link failure, the 
processing element simply selects the second input port, if it is using the first 
input port. If the PE is using the second input port and it detects an input data 
error, a fatal failure occurs. In the case of a vertical link failure, the PE invokes a 
reconfiguration by generating a reconfiguration request. Various states were defined 
for the switches and it was proved that the proposed number of switch states is 
sufficient to implement the algorithm. 
A central processor is linked to the array for providing the inputs and receiving 
the outputs. The central processor controls the clock input of the array and when 
a fault occurs, the central processor inserts delays in the clock as required by the 
reconfiguration algorithm. This algorithm makes full use of non-faulty partial 
results after the occurrence of a fault and it does not require flushing of the array 
every time a fault occurs. 
The probability of survival for this algorithm was calculated analytically in 
Chapter 4. Next, the simulation results were presented. The simulation program 
injects random faults in the array and checks the outcome of the algorithm. The 
probabilities of survival were estimated based on the outcome of the random fault 
injection and a 95% confidence interval was defined for each estimated value. The 
number of trials was calculated based on a maximum margin of error of 2% and 
on the required value of the confidence level (which is assumed to be 95% here). 
The simulation results were analyzed in Chapter 4 and it was shown that the 
overall probability of survival is approximately 99.903% for a 4 x 4 array (assumed 
probability of PE failure = $10^{-4}$ and probability of link failure = $10^{-6}$). 
It was shown that the probability of survival after a fault occurrence decreases 
with the increasing number of faults but it is overshadowed by the fact that the 
probability of occurrence of faults also decreases with increasing number of faults. 
In the next chapter some suggestions for further research are given. 
Chapter 7 
SUGGESTIONS FOR 
FURTHER RESEARCH 
In this report an algorithm for on-line reconfiguration was presented. This 
algorithm can be extended in various directions depending upon the requirements. 
Some of the extensions are listed below. 
• The proposed algorithm uses only one row of spare cells, therefore it would 
not be very effective for a high failure rate of PEs. For such cases a greater 
number of spare cells is required. An additional column of spares can be 
added and the algorithm can be modified to make effective use of column and 
row spares. 
• When a spare column and a spare row of PEs are used, each PE requires 3 or 
more static coefficient latches (the exact number depends on the algorithm). 
This requirement can be reduced by modifying the algorithm, so that only 
two copies of each static coefficient are kept in the array. When a PE fails, 
the static coefficients can be moved from one cell to the other (if required). 
This requires some additional time for reconfiguration but the hardware is 
reduced. 
Bibliography 
[1] A. Huang, "Architectural Considerations Involved in the Design of an Optical 
Signal Computer," Proc. of the IEEE, vol. 72, no. 7, pp. 780-786, July 1984. 
[2] H.T. Kung, "Why Systolic Architectures?," Computer, pp. 37-46, 
January 1982. 
[3] P.O. Dianne, "Systolic Arrays for Matrix Transpose and other Reorderings," 
IEEE Transactions on Computers, vol. C-36, no. 1, pp. 117-122, January 1987. 
[4] R.B. Urquhart and D. Wood, "Systolic Matrix and Vector Multiplication 
Methods for Signal Processing," IEE Proceedings, vol. 131, pt. F, no. 6, pp. 623-631, 
October 1984. 
[5] J.C. Ward, J.V. McCanny and J.G. McWhirter, "Bit Level Systolic Array 
Implementation of the Winograd Fourier Transform Algorithm," IEE Proceedings, 
vol. 132, pt. F, no. 6, pp. 473-479, October 1985. 
[6] K.H. Huang and J.A. Abraham, "Algorithm-Based Fault Tolerance for Matrix 
Operations," IEEE Trans. Computers, vol. C-33, no. 6, pp. 518-528, June 1984. 
[7] J.H. Patel and L.Y. Fung, "Concurrent Error Detection in ALU's by 
Recomputing with Shifted Operands," IEEE Trans. Computers, vol. C-31, no. 7, 
pp. 589-595, July 1982. 
[8] J.H. Patel and L.Y. Fung, "Concurrent Error Detection in Multiply and Divide 
Arrays," IEEE Trans. Computers, vol. C-32, no. 4, pp. 417-422, April 1983. 
[9] S.-W. Chan, S.S. Leung and C.-L. Way, "Systematic Design Strategy for 
Concurrent Error Diagnosable Iterative Logic Arrays," IEE Proceedings, pt. E, 
vol. 135, no. 2, pp. 87-94, March 1988. 
[10] A. Majumdar, C.S. Raghvendra and M.A. Breuer, "Fault Tolerance in Linear 
Systolic Array using Time Redundancy," IEEE Trans. Computers, vol. 39, 
no. 2, pp. 269-276, February 1990. 
[11] R.K. Gulati and S.M. Reddy, "Concurrent Error Detection in VLSI Array 
Structures," ICCD 1986, pp. 488-491. 
[12] R.J. Cosentino, "Concurrent Error Correction in Systolic Architecture," IEEE 
Trans. CAD, vol. 7, no. 1, pp. 117-125, January 1988. 
[13] R.M. Lea and H.S. Bolouri, "Fault Tolerance: Step Towards WSI," IEE 
Proceedings, pt. E, vol. 135, no. 6, pp. 289-297, November 1988. 
[14] R. Negrini, M. Sami and R. Stefanelli, "Fault Tolerance Techniques for Array 
Structures Used in Supercomputing," Computer, vol. 19, no. 2, pp. 78-87, 
February 1986. 
[15] M. Chean and J.A.B. Fortes, "A Taxonomy of Reconfiguration Techniques 
for Fault-Tolerant Processor Arrays," Computer, vol. 23, no. 1, pp. 55-69, 
January 1990. 
[16] C.W.H. Lam, H.F. Li and R. Jayakumar, "A Study of Two Approaches for 
Reconfiguring Fault-Tolerant Systolic Arrays," IEEE Trans. Computers, 
vol. C-38, no. 6, pp. 833-844, June 1989. 
[17] S.Y. Kuo and W.K. Fuchs, "Efficient Spare Allocation for Reconfigurable 
Arrays," IEEE Design and Test, vol. 4, no. 1, pp. 24-31, February 1987. 
[18] A.L. Rosenberg, "The Diogenes Approach to Testable Fault-Tolerant Arrays 
of Processors," IEEE Trans. Computers, vol. C-32, no. 10, pp. 902-910, 
October 1983. 
[19] F. Lombardi, M. Sami and R. Stefanelli, "Reconfiguration of VLSI Arrays by 
Covering," IEEE Trans. CAD, vol. 8, no. 9, pp. 952-964, September 1989. 
[20] A.D. Singh, "Interstitial Redundancy: An Area Efficient Fault Tolerance 
Scheme for Large Area VLSI Processor Arrays," IEEE Trans. Computers, 
vol. C-37, no. 11, pp. 1398-1410, November 1988. 
[21] L. Snyder, "Introduction to the Configurable Highly Parallel Computer," IEEE 
Computer, vol. 15, no. 1, pp. 47-56, January 1982. 
[22] W.W. Hines, Probability and Statistics in Engineering and Management 
Science, John Wiley & Sons, pp. 240-256, 1980. 
Appendix A 
Program Structure 
This program calculates the probability of survival by injecting random faults in 
the array and running the reconfiguration algorithm. 
The basic flow of control is given below: 
Start :            get the array dimensions from the terminal; 
                   get the number of trials from the terminal; 
                   get the maximum number of faults, (maxfault_PE, maxfault_link) 
                   to be injected; 
                   initialize various variables; 
Fault :            start with maximum number of PE faults, fault_PE to be 0 and 
                   maximum number of link faults, fault_link to be 1. 
Trial :            generate arrays; /* one array each for PEs, vertical 
                   switches, horizontal switches, 
                   vertical links and horizontal links */ 
Loop :             decide randomly, which fault (PE or link) to inject; 
                   if link fault has to be injected, go to Link_fault; 
PE_fault :         generate the index (i,j) of the failed PE randomly; 
                   if PE(i,j) is non-faulty then go to PE_algorithm; 
                   go to PE_fault; 
PE_algorithm :     check the success of the algorithm; 
                   if algorithm is successful, go to Successful_PE; 
                   increment the number of attempts for (fault_PE, fault_link); 
                   go to Next_trial; 
Successful_PE :    modify the PE, switch and link arrays; 
                   if number of PE and link faults, already injected 
                   = (fault_PE, fault_link), go to Success; 
                   increment the number of injected PE faults; 
                   go to Loop; 
Link_fault :       decide randomly, which failure (horizontal link or vertical 
                   link) to inject; 
                   if vertical link failure has to be injected, go to Vertical_link; 
Horizontal_link :  generate the index (i,j) of the failed link randomly; 
                   check the success of the algorithm; 
                   if algorithm is successful, go to Successful_link; 
                   increment the number of attempts for (fault_PE, fault_link); 
                   go to Next_trial; 
Vertical_link :    generate the index (i,j) of the failed link randomly; 
                   check the success of the algorithm; 
                   if algorithm is successful, go to Successful_link; 
                   increment the number of attempts for (fault_PE, fault_link); 
                   go to Next_trial; 
Successful_link :  modify the corresponding switch and link array; 
                   if number of PE and link faults, already injected 
                   = (fault_PE, fault_link), go to Success; 
                   increment the number of injected link faults; 
                   go to Loop; 
Next_trial :       reset the number of faults, already injected; 
                   increment the number of trials, already attempted; 
                   go to Trial; 
Success :          increment the number of successes recorded for (fault_PE, fault_link); 
                   increment the number of attempts for (fault_PE, fault_link); 
                   increment the number of trials, already attempted; 
                   if number of trials, already attempted is less than the 
                   specified number of trials, go to Trial; 
                   if (fault_PE, fault_link) is less than 
                   (maxfault_PE, maxfault_link), go to Next; 
                   calculate estimated probability of survival for each 
                   combination of (fault_PE, fault_link); 
                   go to End; 
Next :             if fault_link is less than maxfault_link, go to Increment_link; 
                   increment fault_PE; 
                   reset fault_link; 
                   go to Reset_trial; 
Increment_link :   increment fault_link; 
Reset_trial :      reset number of trials, already attempted; 
                   reset number of faults, already injected; 
                   go to Trial; 
End :              Stop. 
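The outer structure of this control flow (sweep over the fault combinations, repeat a fixed number of trials for each, record the fraction of successes) can be restated with structured loops. The sketch below is an assumption-laden illustration and not the original program: `estimate_survival` and `run_trial` are hypothetical names, `run_trial` stands for one complete trial on a fresh array and must be supplied by the caller, and the sweep here also includes the fault-free combination (0, 0), whereas the flow above starts at one link fault.

```python
import random


def estimate_survival(max_pe, max_link, trials, run_trial, seed=0):
    """Estimate the probability of survival for every (fault_PE, fault_link).

    run_trial(fault_pe, fault_link, rng) must return True for outcome S
    (reconfiguration succeeded) and False for outcome F (fatal failure).
    """
    rng = random.Random(seed)
    estimates = {}
    for fault_pe in range(max_pe + 1):
        for fault_link in range(max_link + 1):
            successes = 0
            for _ in range(trials):
                # Each call is one trial and is expected to start from a
                # fault free array.
                if run_trial(fault_pe, fault_link, rng):
                    successes += 1
            estimates[(fault_pe, fault_link)] = successes / trials
    return estimates


if __name__ == "__main__":
    # Placeholder trial: survives with a made-up probability, for demonstration only.
    def toy_trial(fault_pe, fault_link, rng):
        return rng.random() < 0.9 ** (fault_pe + fault_link)

    for key, value in estimate_survival(1, 2, 1000, toy_trial).items():
        print(key, round(value, 3))
```

The toy trial function carries no information about the reconfiguration algorithm; it only shows how a caller would plug a trial routine into the sweep.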
Appendix B 
Probability of Survival 
Here the simulation results are listed for various array sizes. The first two columns 
of the table show the number of PE and link faults corresponding to the particular 
row. The other columns give the 95% confidence interval of the probability of 
survival (calculated from the simulation results) for different array sizes (number 
of trials = 3000). 
Number  Number  Estimated Probability Confidence Interval (%) 
of PE   of link (Confidence level = 95%) 
Faults  Faults       5 x 5             6 x 6            10 x 10           20 x 20 
 0       0      100.00 - 100.00   100.00 - 100.00   100.00 - 100.00   100.00 - 100.00 
 0       1      100.00 - 100.00   100.00 - 100.00   100.00 - 100.00   100.00 - 100.00 
 0       2       96.35 -  97.58    97.57 -  98.56    99.04 -  99.62    99.69 -  99.98 
 0       3       91.28 -  93.19    93.26 -  94.91    97.28 -  98.32    98.88 -  99.52 
 0       4       82.86 -  85.47    87.60 -  89.86    94.33 -  95.87    98.37 -  99.16 
 0       5       74.03 -  77.10    79.63 -  82.14    91.91 -  93.76    97.88 -  98.79 
 0       6       61.58 -  65.02    71.45 -  74.62    88.96 -  91.11    95.37 -  96.76 
 0       7       53.15 -  56.71    63.60 -  67.00    83.38 -  85.96    95.59 -  96.9? 
 0       8       41.33 -  44.87    60.83 -  64.30    79.73 -  82.53    86.32 -  88.68 
 0       9       31.65 -  35.02    58.78 -  62.28    75.2? -  78.2?    94.36 -  95.90 
 0      10       24.56 -  27.71    48.28 -  51.86    70.22 -  73.44    93.93 -  95.53 
 0      11       25.67 -  28.86    29.48 -  32.79    64.81 -  68.22    82.00 -  84.67 
 0      12       21.20 -  24.20    21.07 -  24.06    58.35 -  61.85    86.32 -  88.68 
 0      13       13.02 -  15.52    15.33 -  18.00    51.85 -  55.42    85.45 -  87.88 
 0      14        8.48 -  10.58    12.12 -  14.55    48.84 -  52.42    71.75 -  74.92 
 0      15        8.48 -  10.58     8.93 -  11.07    39.17 -  42.69    69.85 -  73.08 
 1       0      100.00 - 100.00   100.00 - 100.00   100.00 - 100.00   100.00 - 100.00 
 1       1       99.08 -  99.65    99.55 -  99.92    99.46 -  99.87    99.64 -  99.96 
 1       2       96.72 -  97.88    96.79 -  97.94    98.41 -  99.19    99.42 -  99.85 
 1       3       90.33 -  92.34    92.76 -  94.51    96.57 -  97.76    98.84 -  99.49 
 1       4       81.72 -  84.41    87.57 -  89.83    94.11 -  95.69    98.14 -  98.99 
 1       5       73.38 -  76.48    80.18 -  82.95    91.42 -  93.32    97.61 -  98.59 
 1       6       64.41 -  67.79    72.46 -  75.60    87.95 -  90.18    96.28 -  97.52 
 1       7       53.35 -  56.91    65.72 -  69.08    84.38 -  86.89    94.11 -  95.69 
 1       8       43.29 -  46.85    55.60 -  59.14    79.36 -  82.18    93.47 -  95.13 
 1       9       33.19 -  36.61    46.61 -  50.19    74.37 -  77.43    92.19 -  94.01 
 1      10       25.54 -  28.72    39.94 -  43.46    69.82 -  73.05    90.01 -  92.06 
 1      11       17.50 -  20.30    31.55 -  34.92    64.57 -  67.96    88.82 -  90.98 
 1      12       13.18 -  15.69    24.27 -  27.40    57.58 -  61.09    85.31 -  87.75 
 1      13        9.94 -  12.19    21.10 -  24.10    54.99 -  58.54    84.79 -  87.27 
 1      14        5.81 -   7.59    15.88 -  18.58    47.08 -  50.66    78.57 -  81.43 
 1      15        4.16 -   5.71    10.55 -  12.85    43.52 -  47.08    79.94 -  82.73 
 2       0       82.83 -  85.44    85.52 -  87.95    89.66 -  91.74    95.15 -  96.58 
 2       1       80.08 -  82.86    82.96 -  85.57    89.21 -  91.33    93.76 -  95.38 
 2       2       77.54 -  80.46    81.14 -  83.86    88.33 -  90.53    93.22 -  94.91 
 2       3       73.69 -  76.78    78.16 -  81.04    88.40 -  90.60    93.26 -  94.94 
 2       4       69.14 -  72.39    72.91 -  76.03    84.17 -  86.70    92.76 -  94.51 
 2       5       60.67 -  64.13    68.06 -  71.34    83.03 -  85.63    91.45 -  93.35 
 2       6       53.59 -  57.15    63.29 -  66.71    79.08 -  81.92    90.85 -  92.81 
 2       7       44.62 -  48.18    56.03 -  59.57    75.66 -  78.67    90.75 -  92.72 
 2       8       37.52 -  41.01    47.31 -  50.89    71.72 -  74.88    87.98 -  90.22 
 2       9       27.67 -  30.93    42.95 -  46.51    68.02 -  71.31    87.74 -  89.99 
 2      10       22.67 -  25.73    33.19 -  36.61    63.16 -  66.57    83.52 -  86.08 
 2      11       17.27 -  20.06    26.20 -  29.40    59.15 -  62.65    83.03 -  85.63 
 2      12       13.14 -  15.66    20.65 -  23.62    52.95 -  56.51    81.48 -  84.18 
 2      13        8.96 -  11.11    17.27 -  20.06    49.41 -  52.99    80.63 -  83.37 
 2      14        4.65 -   6.28    13.02 -  15.52    45.91 -  49.49    77.47 -  80.39 
 2      15        3.24 -   4.63     9.98 -  12.22    39.74 -  43.26    75.36 -  78.38 
 3       0       50.48 -  54.05    59.15 -  62.65    72.91 -  76.03    86.18 -  88.56 
 3       1       49.61 -  53.19    54.89 -  58.44    72.46 -  75.60    84.62 -  87.11 
 3       2       48.58 -  52.16    56.33 -  59.87    70.19 -  73.41    84.03 -  86.57 
 3       3       47.21 -  50.79    52.78 -  56.35    69.65 -  72.89    83.03 -  85.63 
 3       4       41.43 -  44.97    50.28 -  53.85    67.89 -  71.18    83.41 -  85.99 
 3       5       39.50 -  43.03    47.28 -  50.86    65.62 -  68.98    82.00 -  84.67 
 3       6       34.68 -  38.12    42.29 -  45.84    64.03 -  67.43    79.49 -  82.31 
 3       7       28.39 -  31.67    39.90 -  43.43    60.40 -  63.87    80.08 -  82.86 
 3       8       24.33 -  27.47    31.98 -  35.36    57.41 -  60.93    79.60 -  82.40 
 3       9       19.48 -  22.39    27.80 -  31.06    53.96 -  57.51    78.60 -  81.46 
 3      10       15.56 -  18.24    23.74 -  26.86    50.68 -  54.25    77.95 -  80.85 
 3      11       11.03 -  13.37    19.90 -  22.83    49.95 -  53.52    74.71 -  77.76 
 3      12        8.61 -  10.72    15.04 -  17.69    43.85 -  47.42    74.34 -  77.40 
 3      13        5.31 -   7.03    14.01 -  16.59    42.56 -  46.11    72.29 -  75.44 
 3      14        4.65 -   6.28    10.33 -  12.61    36.73 -  40.21    70.12 -  73.34 
 3      15        2.81 -   4.12     7.28 -   9.25    32.93 -  36.34    67.68 -  70.98 
 4       0       21.01 -  23.99    30.76 -  34.11    52.42 -  55.98    71.95 -  75.11 
 4       1       21.49 -  24.51    29.67 -  32.99    50.85 -  54.42    70.94 -  74.13 
 4       2       22.28 -  25.32    28.46 -  31.74    47.98 -  51.56    70.70 -  73.90 
 4       3       21.62 -  24.64    26.98 -  30.22    50.91 -  54.49    69.95 -  73.18 
 4       4       18.34 -  21.19    27.02 -  30.25    47.98 -  51.56    68.22 -  71.51 
 4       5       16.82 -  19.58    2?.7? -  27.91    46.48 -  50.05    69.78 -  73.02 
 4       6       14.04 -  16.62    21.85 -  24.88    45.41 -  48.99    68.06 -  71.34 
 4       7       11.06 -  13.41    18.31 -  21.16    42.92 -  46.48    67.68 -  70.98 
 4       8       11.22 -  13.58    15.82 -  18.52    43.22 -  46.78    66.16 -  69.50 
 4       9       10.71 -  13.02    16.20 -  18.93    36.06 -  39.54    64.61 -  67.99 
 4      10        6.21 -   8.05    12.82 -  15.31    36.46 -  39.94    65.89 -  69.24 
 4      11        4.99 -   6.67    10.49 -  12.78    33.56 -  36.98    63.39 -  66.81 
 4      12        3.09 -   4.45     8.55 -  10.65    29.90 -  33.23    61.71 -  65.16 
 4      13        1.88 -   2.98     7.19 -   9.15    27.93 -  31.20    62.72 -  66.15 
 4      14        2.06 -   3.21     5.09 -   6.78    27.57 -  30.83    58.72 -  62.22 
 4      15        1.15 -   2.05     3.97 -   5.49    23.12 -  26.21    57.?? -  61.0? 
 5       0        4.65 -   6.28    10.61 -  12.92    31.51 -  34.89    58.85 -  62.35 
 5       1        4.68 -   6.32     9.63 -  11.84    30.17 -  33.50    56.40 -  59.93 
 5       2        4.19 -   5.74    10.71 -  13.02    29.80 -  33.13    56.30 -  59.83 
 5       3        3.76 -   5.24     9.66 -  11.88    28.69 -  31.98    57.01 -  60.53 
 5       4        3.60 -   5.06     9.85 -  12.08    28.79 -  32.08    56.54 -  60.06 
 5       5        3.27 -   4.67     7.31 -   9.29    27.41 -  30.66    ??.?? -  ??.?8 
 5       6        3.30 -   4.70     7.85 -   9.88    27.11 -  30.35    55.60 -  59.14 
 5       7        2.75 -   4.05     7.22 -   9.18    24.40 -  27.54    54.29 -  57.84 
 5       8        2.09 -   3.24     6.93 -   8.87    24.76 -  27.91    52.72 -  56.28 
 5       9        1.62 -   2.65     5.62 -   7.38    22.41 -  25.46    54.73 -  58.27 
 5      10        0.78 -   1.55     4.72 -   6.35    22.11 -  25.15    51.85 -  55.42 
 5      11        1.35 -   2.31     3.39 -   4.81    19.45 -  22.35    50.55 -  54.12 
 5      12        0.73 -   1.47     3.54 -   4.99    19.19 -  22.08    48.34 -  51.92 
 5      13        0.59 -   1.28     1.91 -   3.02    18.11 -  20.95    48.61 -  52.19 
 5      14        0.43 -   1.04     1.59 -   2.61    16.75 -  19.51    47.08 -  50.66 
 5      15        0.10 -   0.50     1.15 -   2.05    14.46 -  17.07    45.88 -  49.45 
Number Number Estimated Probability Confidence Interval 
of PE of )ink (Confidence level = 95%) 
Faults Faults 5 X ,5 6 X 6 10 X 10 20 X 20 
6 0 0.00-0.00 1.65- 2.69 15.78- 18.48 45.01 - 48.59 
6 l 0.00-0.00 1.50- 2.50 15.91 - 18.62 43.59-47.15 
6 2 0.00 - 0.00 I 1.27 - 2.20 16.27- 19.00 43.55 - 47.11 
6 3 0.00-0.00 1.30- 2.24 13.85- 16.42 42.56 - 46.11 
6 4 0.00-0.00 1.38- 2.35 16.53 - 19.27 40.20-43.73 
6 5 u.oo- 0.00 1.21 - 2.12 14.56-17.17 41.06- 44.60 
6 6 0.00-0.00 0.70- 1.43 14.53-17.14 42.46 - 46.01 
6 7 0.00-0.00 0.75- 1.51 11.89- 14.31 40.33-43.87 
6 8 0.00- o.oo 0.64- 1.36 12.79-15.?8 39.01 - 42.53 
6 9 0.00-0.00 0.81 - 1.59 11.00- 13.::!-1 40.30-43.83 
6 10 0.00-0.00 0.40- 1.00 10.87-13.20 36.86- 40.34 
6 11 0.00-0.00 0.53- 1.20 10.45- 12.75 38.05-41.55 
6 12 0.00- 0.00 0.40- 1.00 8.70-10.83 37.35-40.85 
6 13 0.00-0.00 0.22- 0.71 8.42- 10.51 36.10-39.57 
6 14 0.00- o.oo 0.27 - 0.79 7.94-9.99 34.81 - 38.26 
6 15 0.00 - 0.00 0.22- 0.71 7.60-9.60 34.12 - 37.55 
7 0 0.00-0.00 0.00- 0.00 6.62-8.51 29.31 - 32.62 
7 1 0.00 - 0.00 o.oo- 0.00 6.09-7.91 29.80-33.13 
7 2 0.00-0.00 0.00- 0.00 5.65 - 7.42 28.26 - 31.54 
7 3 0.00-0.00 0.00- 0.00 5.31-7.03 29.51 -32.82 
7 4 0.00-0.00 0.00- 0.00 6.43- 8.30 29.90-33.23 
7 5 0.00 - 0.00 0.00- 0.00 6.81-8.72 29.31 - 32.62 
7 6 0.00 -· 0.00 0.00- 0.00 5.03-6.71 28.13 - 31.40 
7 7 0.00-0.00 d.OO- 0.00 5.06-6.74 29.44 - 32.76 
7 8 0.00-0.00 0.00- 0.00 5.06-6.74 27.44- 30.69 
7 9 0.00-0.00 0.00- 0.00 4.59- 6.21 28.03 - 31.30 
7 10 0.00-0.00 0.00- 0.00 4.68-6.32 27.47-30.73 
7 11 0.00-0.00 0.00- 0.00 4.13- 5.67 26.16 - 29.37 
j 12 0.00-0.00 0.00- 0.00 3.45-4.88 26.26-29.47 
7 13 0.00-0.00 0.00- 0.00 3 .36-4.77 24 .33-27.47 
7 14 0.00-0.00 0.00- 0.00 2 .90-4.23 26.03 - 29.23 
7 15 0.00-0.00 0.00- 0.00 2.48- 3. 72 24.33 - 27.4 7 
8        0        0.00 - 0.00    0.00 - 0.00    1.70 - 2.76      18.80 - 21.67
8        1        0.00 - 0.00    0.00 - 0.00    1.85 - 2.95      19.28 - 22.18
8        2        0.00 - 0.00    0.00 - 0.00    2.00 - 3.13      18.18 - 21.02
8        3        0.00 - 0.00    0.00 - 0.00    1.53 - 2.54      17.30 - 20.10
8        4        0.00 - 0.00    0.00 - 0.00    1.44 - 2.43      18.31 - 21.16
8        5        0.00 - 0.00    0.00 - 0.00    2.06 - 3.21      19.02 - 21.91
8        6        0.00 - 0.00    0.00 - 0.00    1.47 - 2.46      18.37 - 21.23
8        7        0.00 - 0.00    0.00 - 0.00    1.65 - 2.69      18.11 - 20.95
8        8        0.00 - 0.00    0.00 - 0.00    1.53 - 2.54      17.86 - 20.68
8        9        0.00 - 0.00    0.00 - 0.00    1.21 - 2.12      18.73 - 21.60
8        10       0.00 - 0.00    0.00 - 0.00    0.95 - 1.78      17.95 - 20.78
8        11       0.00 - 0.00    0.00 - 0.00    0.98 - 1.82      16.33 - 19.07
8        12       0.00 - 0.00    0.00 - 0.00    1.12 - 2.01      16.72 - 19.48
8        13       0.00 - 0.00    0.00 - 0.00    0.81 - 1.59      17.01 - 19.79
8        14       0.00 - 0.00    0.00 - 0.00    0.89 - 1.71      16.85 - 19.62
8        15       0.00 - 0.00    0.00 - 0.00    1.04 - 1.90      14.66 - 17.28
9        0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.26      11.54 - 13.93
9        1        0.00 - 0.00    0.00 - 0.00    0.22 - 0.71      11.73 - 14.13
9        2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      11.35 - 13.72
9        3        0.00 - 0.00    0.00 - 0.00    0.38 - 0.96      11.54 - 13.93
9        4        0.00 - 0.00    0.00 - 0.00    0.22 - 0.71      11.28 - 13.65
9        5        0.00 - 0.00    0.00 - 0.00    0.15 - 0.58      10.10 - 12.36
9        6        0.00 - 0.00    0.00 - 0.00    0.30 - 0.84      10.93 - 13.27
9        7        0.00 - 0.00    0.00 - 0.00    0.25 - 0.75      11.57 - 13.96
9        8        0.00 - 0.00    0.00 - 0.00    0.13 - 0.54      10.07 - 12.33
9        9        0.00 - 0.00    0.00 - 0.00    0.10 - 0.50      10.58 - 12.88
9        10       0.00 - 0.00    0.00 - 0.00    0.08 - 0.45      9.43 - 11.63
9        11       0.00 - 0.00    0.00 - 0.00    0.08 - 0.45      9.56 - 11.77
9        12       0.00 - 0.00    0.00 - 0.00    0.13 - 0.54      9.59 - 11.81
9        13       0.00 - 0.00    0.00 - 0.00    0.13 - 0.54      9.69 - 11.91
9        14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      8.93 - 11.07
9        15       0.00 - 0.00    0.00 - 0.00    0.17 - 0.63      9.24 - 11.42
10       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      6.81 - 8.72
10       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.16      5.68 - 7.45
10       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      6.18 - 8.02
10       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      6.49 - 8.37
10       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.26      6.31 - 8.16
10       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      6.02 - 7.84
10       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.26      6.02 - 7.84
10       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.16      5.43 - 7.17
10       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      5.62 - 7.38
10       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      6.56 - 8.44
10       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      5.37 - 7.10
10       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      5.34 - 7.06
10       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      6.53 - 8.41
10       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      5.46 - 7.20
10       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.10      5.18 - 6.89
10       15       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      4.81 - 6.46
11       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      3.94 - 5.46
11       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.87 - 4.19
11       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.51 - 3.76
11       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      4.13 - 5.67
11       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.72 - 4.01
11       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      4.87 - 6.53
11       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.60 - 3.87
11       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      3.05 - 4.41
11       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.60 - 3.87
11       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.93 - 4.27
11       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.96 - 4.30
11       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      3.05 - 4.41
11       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      3.21 - 4.59
11       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.69 - 3.98
11       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.06 - 3.21
11       15       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.03 - 3.17
12       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      2.15 - 3.32
12       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.15 - 2.05
12       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.15 - 2.05
12       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.09 - 1.97
12       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.92 - 1.74
12       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.95 - 1.78
12       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.38 - 2.35
12       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.41 - 2.39
12       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.01 - 1.86
12       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.04 - 1.90
12       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.84 - 1.63
12       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.81 - 1.59
12       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.01 - 1.86
12       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.75 - 1.51
12       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      1.24 - 2.16
12       15       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.75 - 1.51
13       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.51 - 1.16
13       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.48 - 1.12
13       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.62 - 1.32
13       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.17 - 0.63
13       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.32 - 0.88
13       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.53 - 1.20
13       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.02 - 0.31
13       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.27 - 0.79
13       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.20 - 0.67
13       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.43 - 1.04
13       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.25 - 0.75
13       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.40 - 1.00
13       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.32 - 0.88
13       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.45 - 1.08
13       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.48 - 1.12
13       15       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.53 - 1.20
14       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.04 - 0.36
14       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.08 - 0.45
14       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.15 - 0.58
14       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.15 - 0.58
14       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.10 - 0.50
14       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.02 - 0.31
14       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
14       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.17 - 0.63
14       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.15 - 0.58
14       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
14       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.04 - 0.36
14       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.08 - 0.45
14       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
14       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.06 - 0.41
14       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.10 - 0.50
14       15       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.10 - 0.50
15       0        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.04 - 0.36
15       1        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       2        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.26
15       3        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.16
15       4        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       5        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       6        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.16
15       7        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       8        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.26
15       9        0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       10       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.10
15       11       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
15       12       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
15       13       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
15       14       0.00 - 0.00    0.00 - 0.00    0.00 - 0.00      0.00 - 0.00
Since the probability of occurrence of a larger number of faults than this is very
small, the table is truncated here.
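
The interval endpoints tabulated above can be reproduced, to two decimal places, with the usual normal-approximation confidence interval for a proportion. The sketch below is illustrative only: the function name is hypothetical, and the sample size of 3000 trials per fault pattern is an assumption that is not stated in this section; it was chosen because it is consistent with tabulated entries such as 0.04 - 0.36 and 8.93 - 11.07. The endpoints are expressed in percent and clipped to the range 0 to 100.

```python
import math


def wald_interval_percent(successes, trials, z=1.96):
    """Normal-approximation confidence interval for a proportion, in percent.

    successes: number of trials in which the event of interest occurred
    trials:    total number of trials (3000 below is an ASSUMED value,
               not taken from the text of this section)
    z:         standard normal quantile (1.96 for a 95% confidence level)
    """
    p = successes / trials
    half_width = z * math.sqrt(p * (1.0 - p) / trials)
    lower = max(0.0, p - half_width)  # a probability cannot be negative
    upper = min(1.0, p + half_width)  # nor can it exceed one
    return round(100.0 * lower, 2), round(100.0 * upper, 2)


if __name__ == "__main__":
    # Hypothetical event counts out of an assumed 3000 trials.
    for count in (0, 1, 6, 300, 1404):
        low, high = wald_interval_percent(count, 3000)
        print(f"{count:5d}  {low:.2f} - {high:.2f}")
```

Under this assumption a count of zero gives the degenerate interval 0.00 - 0.00, which would explain the many such entries for the smaller arrays at high fault counts: the approximation has zero width when the estimated probability is zero, so those entries should not be read as proof that the event is impossible.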