Experimental evaluation of the certification-trail method by Wilson, Dwight S. et al.
N94- 36065
Experimental Evaluation of the
Certification-Trail Method
Gregory F. Sullivan,1 Dwight S. Wilson,2 Gerald M. Masson,3
Mamoru Itoh,4 Warren W. Smith, Jonathan S. Kay5
Dept. of Computer Science, Johns Hopkins Univ., Baltimore, MD 21218
Abstract
Certification trails are a recently introduced and promising
approach to fault-detection and fault-tolerance [1, 2, 3, 4]. In
this paper, we report on a comprehensive attempt to assess
experimentally the performance and overall value of the method.
The method is applied to algorithms for the following problems:
Huffman tree, shortest path, minimum spanning tree, sorting,
and convex hull. Our results reveal many cases in which an
approach using certification trails allows for significantly faster
overall program execution time than a basic time-redundancy
approach.
We also examine algorithms for the answer-validation prob-
lem for abstract data types. This kind of problem was originally
proposed in [3] and provides a basis for applying the certification-
trail method to wide classes of algorithms. We implemented and
analyzed answer-validation solutions for two types of priority
queues. In both cases, the algorithm which performs answer-
validation is substantially faster than the original algorithm for
computing the answers.
Next we present a probabilistic model and analysis which enables
comparison between the certification-trail method and the
time-redundancy approach. The analysis reveals some substan-
tial and sometimes surprising advantages for the certification-
trail method.
1Research partially supported by NSF Grants CCR-8910569 and CCR-8908092 and an
IBM Technology Interchange Program Grant.
2Research partially supported by NSF Grant CCR-8910569 and an IBM Technology
Interchange Program Grant.
3Research partially supported by NASA Grant NSG 1442 and an IBM Technology
Interchange Program Grant.
4Visiting Scholar, Matsushita Electronic Components Co.
5Currently at Dept. of Computer Science, University of California, San Diego
https://ntrs.nasa.gov/search.jsp?R=19940031558 2020-06-16T10:46:03+00:00Z
Finally we discuss the work our group has performed on the
design and implementation of fault injection testbeds for
experimental analysis of the certification-trail technique. This work
employs two distinct methodologies: software fault injection
(modification of instruction, data, and stack segments of programs on
a Sun SPARCstation ELC and on an IBM 386 PC) and hardware
fault injection (control, address, and data lines of a Motorola
MC68000-based target system pulsed at logical zero/one values).
Our results indicate the viability of the certification-trail
technique. We also believe the tools we have developed provide a
solid base for additional exploration.
Keywords: Software fault tolerance, certification trails, error
monitoring, design diversity, data structures.
1 Introduction
Certification trails are a recently introduced and promising approach to
fault-detection and fault-tolerance [1, 3]. In this paper, we report on a
comprehensive attempt to assess experimentally the performance and overall
value of the method. We have implemented several fundamental algorithms
together with versions of the algorithms which generate and utilize
certification trails. Specifically, algorithms for the following problems are
analyzed: Huffman tree, shortest path, minimum spanning tree, sorting, and
convex hull. Our results reveal many cases in which an approach using certification
trails allows for significantly faster overall program execution time than a
basic time redundancy approach.
We also examine algorithms for the answer-validation problem for ab-
stract data types. This kind of problem was originally proposed in [3] and
provides a basis for applying the certification-trail method to wide classes of
algorithms. For this paper we implemented and analyzed answer-validation
solutions for two abstract data types. The first solution is for a simplified
priority queue which allows insert, min and deletemin operations, and the
second solution is for a priority queue which allows insert, min, delete and
deletemin operations. In both cases, the algorithm which performs
answer-validation is substantially faster than the original algorithm for
computing the answers.
This paper next presents a simple probabilistic model and analysis which
enables comparison between the certification-trail method and the
time-redundancy approach. The analysis shows that when the certification-trail
method has a smaller execution time than the time-redundancy approach
it yields strictly superior performance. This means the method has both
a smaller probability of error and a smaller probability of undetected
error. Surprisingly, the analysis also reveals that the certification-trail
method can often display superior performance even when it has the same
execution time or a longer execution time than the time-redundancy
approach. This superior behavior stems from the typical asymmetry of the
execution times of the first and second executions in the
certification-trail method.
The paper next discusses the work our group has performed on the design
and implementation of fault injection testbeds. This work employs two
distinct methodologies: software fault injection and hardware fault injection.
The software fault injection tool is similar to an interactive debugger but
more accurately can be considered an interactive bugger. It allows programs
to be halted and faults to be injected by direct modification of the stack,
data and instruction segments of a program. Output can then be captured
and characterized.
The hardware fault injector is based on injecting faults into an operating
microprocessor. The injection is performed by explicitly setting one or more
pins of the microprocessor to logical zero and/or logical one values. The
timing and duration of the pin setting is under control of a supervisory
processor. The testbed also includes a multi-processor system. This system
consists of three processors which are connected to one another pairwise by
shared banks of dual ported memory. We plan to use this system to conduct
evaluation of systems which utilize concurrent execution of algorithms using
the certification-trail method.
2 Introduction to Certification Trails
To explain the essence of the certification-trail technique for software fault
tolerance, we will first discuss a simpler fault-tolerant software method. In
this method the specification of a problem is given and an algorithm to solve
it is constructed. This algorithm is executed on an input and the output is
stored. Next, the same algorithm is executed again on the same input and
the output is compared to the earlier output. If the outputs differ then an
error is indicated, otherwise the output is accepted as correct. This software
fault tolerance method requires additional time, so-called time redundancy
[32, 52]; however, it requires no additional software. It is particularly
valuable for detecting errors caused by transient fault phenomena. If such faults
cause an error during only one of the executions then either the error will be
detected or the output will be correct. The second possibility, of undetected
faults, occurs when the output of the execution is unaffected by the faults.
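As a rough illustration of the time-redundancy method just described, the following Python sketch runs an algorithm twice on the same input and compares the two outputs. The names run_with_time_redundancy, make_faulty_once and ERROR are illustrative assumptions and do not come from the original work; the "fault" is simulated by corrupting the output of the first call only.

```python
ERROR = "error"  # hypothetical token used to signal a detected error


def run_with_time_redundancy(algorithm, data):
    """Execute `algorithm` twice on the same input and compare the outputs."""
    first = algorithm(data)
    second = algorithm(data)
    # Differing outputs indicate an error; agreeing outputs are accepted.
    return first if first == second else ERROR


def make_faulty_once(algorithm):
    """Wrap `algorithm` so that only its first call returns a corrupted output.

    This simulates a transient fault that affects a single execution.
    """
    state = {"calls": 0}

    def wrapped(data):
        state["calls"] += 1
        result = algorithm(data)
        if state["calls"] == 1:
            result = result[::-1]  # corrupt the first output only
        return result

    return wrapped
```

With the built-in sorted as the algorithm, a fault-free run returns the sorted list, while a run in which only the first execution is corrupted is flagged as an error, provided the corruption actually changes the output.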
A variation of the above method uses two separate algorithms, one for
each execution, which have been written independently based on the problem
specification. This technique, called N-version programming [16, 12] (in
this case N=2), allows for the detection of errors caused by some faults
in the software in addition to those caused by transient hardware faults and
utilizes both time and software redundancy. Errors caused by software faults
are detected whenever the independently written programs do not generate
coincident errors.
The certification-trail technique is designed to obtain similar types of
error-detection capabilities but expend fewer resources. The central idea,
as illustrated in Figure 1, is to modify the first algorithm so that it leaves
behind a trail of data which we call a certification trail. This data is chosen
so that it can allow the second algorithm to execute more quickly and/or
have a simpler structure than the first algorithm. As above, the outputs of
the two executions are compared and are considered correct only if they
agree. Note, however, we must be careful in defining this method or else
its error detection capability might be reduced by the introduction of data
dependency between the two algorithm executions. For example, suppose
the first algorithm execution contains an error which causes an incorrect
output and an incorrect trail of data to be generated. Further suppose
that no error occurs during the execution of the second algorithm. It still
appears possible that the execution of the second algorithm might use the
incorrect trail to generate an incorrect output which matches the incorrect
output given by the execution of the first algorithm. Intuitively, the second
execution would be "fooled" by the data left behind by the first execution.
The definitions we give below exclude this possibility. They demand that
the second execution either generate a correct answer or signal that an error
has been detected in the data trail.
3 Formal Definition of a Certification Trail
In this section we will give a formal definition of a certification trail and
discuss some aspects of its realizations and uses.
[Figure 1: Certification trail method. Diagram elements: Input, First
Execution, Certification Trail, Second Execution.]
Definition 3.1 A problem P is formalized as a relation, i.e., a set of ordered
pairs. Let D be the domain (that is, the set of inputs) of the relation P and
let S be the range (that is, the set of solutions) for the problem. We say an
algorithm A solves a problem P iff for all $d \in D$, when d is input to A then
an $s \in S$ is output such that $(d, s) \in P$.
Definition 3.2 Let $P : D \to S$ be a problem. A solution to this problem
using a certification trail consists of two functions $F_1$ and $F_2$ with the
following domains and ranges: $F_1 : D \to S \times T$ and
$F_2 : D \times T \to S \cup \{\mathrm{error}\}$.
T is the set of certification trails. The functions must satisfy the following
two properties:
(1) for all $d \in D$ there exist $s \in S$ and $t \in T$ such that
$F_1(d) = (s, t)$ and $F_2(d, t) = s$ and $(d, s) \in P$
(2) for all $d \in D$ and for all $t \in T$,
either ($F_2(d, t) = s$ and $(d, s) \in P$) or $F_2(d, t) = \mathrm{error}$.
We also require that $F_1$ and $F_2$ be implemented so that they map elements
which are not in their respective domains to the error symbol. The
definitions above assure that the error-detection capability of the
certification-trail approach is similar to that obtained with the simple
time-redundancy approach discussed earlier. (That is, if transient hardware
faults occur during only one of the executions then either an error will be
detected or the output will be correct.) In addition, the examples considered
later indicate that this new approach can also save overall execution time.
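To make the two properties of Definition 3.2 concrete, here is a minimal Python sketch on a toy problem that is not taken from the paper: computing the greatest common divisor of two positive integers. The first execution, f1, runs the extended Euclidean algorithm and leaves the Bezout coefficients behind as the trail; the second execution, f2, only verifies them. The names f1, f2, run_certified and ERROR are assumptions made for this illustration.

```python
ERROR = "error"  # hypothetical error symbol


def f1(d):
    """First execution: return (solution, trail) for d = (a, b), a, b > 0."""
    a, b = d
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:  # extended Euclidean algorithm
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    # Invariant on exit: a * old_x + b * old_y == old_r == gcd(a, b)
    return old_r, (old_r, old_x, old_y)


def f2(d, t):
    """Second execution: return the gcd, or ERROR if the trail is not valid."""
    a, b = d
    try:
        g, x, y = t
    except (TypeError, ValueError):
        return ERROR
    if not all(isinstance(v, int) for v in (g, x, y)):
        return ERROR
    # g divides a and b, and g is an integer combination of a and b,
    # so every common divisor of a and b divides g: g is the gcd.
    if g <= 0 or a % g != 0 or b % g != 0 or a * x + b * y != g:
        return ERROR
    return g


def run_certified(d):
    """Run both executions and accept the output only if they agree."""
    s1, t = f1(d)
    s2 = f2(d, t)
    return s1 if s2 != ERROR and s1 == s2 else ERROR
```

In this sketch f2 never returns a wrong answer, whatever trail it is handed: a corrupted trail either still certifies the true gcd or is rejected, which is the behavior property (2) demands.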
Throughout this section we have assumed that our method is implemented
with software; however, it is clearly possible to implement the method
with assistance from dedicated hardware. The degree of diversity or
independence achieved when using certification trails depends on how they are
used. A fuller discussion of this and of the relationship between certification
trails and other approaches to software fault tolerance is contained in the
expanded version of [1].
4 Generalized Priority Queue
Before we present our example algorithms which use certification trails we
must discuss the notion of an abstract data type. An abstract data type has
a well defined data object or set of data objects, and an abstract data type
has a carefully defined finite collection of operations that can be performed
on its data object(s). Each operation takes a finite number of arguments
(possibly zero), and some but not all operations return answers.
Some of the algorithms presented in the next section use the priority
queue abstract data type. In addition, later in this paper the
answer-validation problem for two variants of the priority queue is presented.
Therefore, we now describe the priority queue. The data consists of a set
of ordered pairs. The first element in these ordered pairs is referred to as
the item number and the second element is called the key value. Ordered
pairs may be added to and removed from the set; however, at all times the item
numbers of distinct ordered pairs must be distinct. It is possible, though,
for multiple ordered pairs to have the same key value. In this paper the item
numbers are integers between 1 and n, inclusive. Our default convention is
that i is an item number, k is a key value and h is a set of ordered pairs.
A total ordering on the pairs of a set can be defined lexicographically as
follows: (i, k) < (i', k') iff k < k' or (k = k' and i < i'). The abstract data
types we will consider support a subset of the following operations.
member(i) returns a boolean value of true if the set contains an ordered
pair with item number i, otherwise returns false.
insert(i, k) adds the ordered pair (i, k) to the set. We require that no other
pair with item number i be in the set.
delete(i) deletes the unique ordered pair with item number i from the set.
We require that a pair with item number i be in the set initially.
changekey(i, k) is executed only when there is an ordered pair with item
number i in the set. This pair is replaced by (i, k).
deletemin (or deletemax) returns the ordered pair which is smallest (or
largest) according to the total order defined above and deletes this
pair. If the set is empty then the token "empty" is returned.
min (or max) returns the ordered pair which is smallest (or largest) accord-
ing to the total order defined above. If the set is empty then the token
"empty" is returned.
predecessor(i) returns the item number of the ordered pair which immedi-
ately precedes the pair with item number i in the total order. If there
is no predecessor then the token "smallest" is returned. We require
that a pair with item number i be in the set initially.
If an operation violates one of the requirements described above then it is
considered to be ill-formed. Also, if an operation has the wrong number or
type of arguments it is considered to be ill-formed.
Many different types and combinations of data structures can be used
to support different subsets of these operations efficiently.
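The operations listed above can be sketched in Python as follows. This is a minimal reference version, not the data structure used in the experiments: it keeps the pairs in a sorted list, so insert and delete cost O(n) rather than the logarithmic time of a balanced tree or heap. The class name GeneralizedPriorityQueue and the choice to raise ValueError for ill-formed operations are assumptions made for the illustration.

```python
import bisect


class GeneralizedPriorityQueue:
    """Set of (item number, key value) pairs ordered by key, then by item."""

    def __init__(self):
        self._key = {}    # item number -> key value
        self._order = []  # sorted list of (key, item): the total order

    def member(self, i):
        return i in self._key

    def insert(self, i, k):
        if i in self._key:
            raise ValueError("ill-formed: item number already in the set")
        self._key[i] = k
        bisect.insort(self._order, (k, i))

    def delete(self, i):
        if i not in self._key:
            raise ValueError("ill-formed: item number not in the set")
        k = self._key.pop(i)
        del self._order[bisect.bisect_left(self._order, (k, i))]

    def changekey(self, i, k):
        self.delete(i)  # raises if i is absent, as the requirement demands
        self.insert(i, k)

    def min(self):
        if not self._order:
            return "empty"
        k, i = self._order[0]
        return (i, k)

    def deletemin(self):
        if not self._order:
            return "empty"
        k, i = self._order.pop(0)
        del self._key[i]
        return (i, k)

    def predecessor(self, i):
        if i not in self._key:
            raise ValueError("ill-formed: item number not in the set")
        pos = bisect.bisect_left(self._order, (self._key[i], i))
        return "smallest" if pos == 0 else self._order[pos - 1][1]
```

Storing the pairs as (key, item) lets Python's tuple comparison reproduce the lexicographic order defined above, so ties on the key value are broken by item number.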
5 Examples of the Certification Trail Technique
with Timing Data
In this section we evaluate the use of certification trails for five well-known
and significant problems in computer science: the convex hull problem, the
minimum spanning tree problem, the shortest path problem, the Huffman
tree problem, and the sorting problem. We have implemented algorithms
for these problems together with other algorithms which generate and use
certification trails.
We provide a full description of the algorithm for the convex hull problem
which generates a certification trail and a full description of the algorithm
which uses that trail. This material has not appeared in our previous publi-
cations [1, 3]. Because of space considerations the discussion of three of the
other algorithms is abbreviated, but references to previous publications or
technical reports which describe the algorithms more fully are given. The
treatment of the sort algorithm is brief but is detailed enough for the inter-
ested reader to implement the certification-trail method.
The algorithms we have chosen to implement are not always the algorithms
which have the smallest asymptotic time complexity. Often the
asymptotically fastest algorithms have large constants of proportionality
which make them slower on the data sizes we examined. We modified and
used some programs from major software distributions such as quicker-sort
from a Berkeley Unix distribution. Other algorithms were based on text-
book discussions. It should be stressed here that this research is exploratory
and we hope to further increase our corpus of algorithm and data-structure
implementations.
5.1 Systems used for timing data
We have collected timing data for the algorithms considered using a Sun
workstation, an IBM 386 PC and a Motorola 68000-based system.
The SUN machine utilized was a SPARCstation ELC with 16MB of
RAM. The system was run as a standalone machine in single user mode
during the timing experiments. Timing data was obtained through the
getrusage() system call; the user times are reported in the data.
Some of the algorithms were also run on an MSDOS machine: a North-
gate 386/33 with 8MB of RAM. The programs were compiled using DJGPP,
DJ Delorie's port of the GNU GCC compiler to MSDOS. This compiler uses
a DOS extender to allow programs to run in protected mode; thus nearly all
of the 8MB in the machine was available, thereby allowing data sets com-
parable in size to those used on the Sun. The programs required no change
to run under MSDOS, though the data generators required minor modifi-
cation because the drand48() family of random number generators was not
available.
Finally, some of the algorithms were also run on a Motorola MC68000-based
target system. In addition to the MC68000 microprocessor which served as
the CPU, the system comprised 512K bytes of RAM, 512
bytes of ROM, and numerous I/O modules to support serial and parallel
communication. A timer module is also included in the system which uses
the 4 MHz clock as a reference so as to provide execution time data for
experiments. This system is discussed in Section 10 relative to fault injection
experiments.
5.2 Explanation of timing data table entries
Much of the data presented in the timing table is essentially self-explanatory
relative to the certification-trail technique and algorithms considered.
However, a brief discussion of the table entries is appropriate.
The Basic Algorithm timing data refers to the execution time of the
algorithm in producing the output without the generation of the certification
trail. All timing data is listed in seconds.
The Generate Certif. timing data refers to the execution time of the al-
gorithm in producing the output with the additional overhead of generating
the certification trail.
The Use Certif. timing data refers to the execution time of the algorithm
in producing the output while using the certification trail.
The Compare timing data refers to the time necessary to compare the
outputs from either two Basic Algorithm runs or from a Generate
Certification Trail run and a Use Certification Trail run. (Obviously, the
comparison time would be the same in each case.) For some of the
experiments, the time was too small to measure and is therefore listed as
0.00. In other experiments, the comparison was included in the algorithm
execution timing data and therefore is not separately listed.
The Total Basic timing data is twice the Basic Algorithm timing data
plus the Comparison time (when available) so as to evaluate the classical
time-redundancy approach.
The Total Certif. timing data is the sum of the Generate Certif. timing
data and the Use Certif. data and Comparison data (when available) so as
to evaluate the certification trail approach.
The % Saving data is the percentage of execution time which is
saved by using the certification-trail method as compared to the classical
time-redundancy method.
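The derived columns can be recomputed from the measured ones. The short Python sketch below assumes that the saving is expressed relative to the Total Basic time; under that assumption it reproduces the first row of Table 1 (Huffman tree on Sun, size 10000). The function names are illustrative only.

```python
def total_basic(basic, compare=0.0):
    """Time-redundancy cost: two basic runs plus one output comparison."""
    return 2 * basic + compare


def total_certif(generate, use, compare=0.0):
    """Certification-trail cost: generate run, use run, and comparison."""
    return generate + use + compare


def percent_saving(t_basic, t_certif):
    """Saving of the certification-trail total relative to the basic total."""
    return 100.0 * (t_basic - t_certif) / t_basic


# First row of Table 1: Basic 0.74, Generate 0.79, Use 0.11, Compare 0.03
tb = total_basic(0.74, 0.03)         # about 1.51 seconds
tc = total_certif(0.79, 0.11, 0.03)  # about 0.93 seconds
saving = percent_saving(tb, tc)      # about 38.41 percent
```

Because the published entries are rounded to two decimals, recomputing other rows from the rounded columns can differ from the printed totals in the last digit.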
For the Huffman tree data, the input size for the Huffman tree program
is the number of nodes. Each node is given a frequency, chosen uniformly
from the integers {1, 2, ..., n}. Here n was selected to be the number of nodes,
but in fact its value does not affect the running time of the algorithm. In
order for the algorithm to execute correctly, the sum of the frequencies must
not cause an arithmetic overflow. The certification trail method will detect
this.
For the minimum spanning tree and shortest path tables, there are two
numbers associated with the input size: the first is the number of vertices
in the graph, the second the number of edges. A graph with the required
number of edges is selected uniformly from the set of all such graphs, then
tested for connectedness. The algorithms will function regardless of
connectedness, but allowing graphs that are not connected would introduce
undesirable variation in the timing data.
For the convex hull tables, the input size is the number of points in the
data set. The points are chosen uniformly from the set of points with integer
coordinates between 0 and 30,000.
For the sorting tables, sorting was timed in two ways. The first set of
results were obtained by sorting integers. To generate a trail, an integer tag
is added to each input integer and an array of these pairs passed to the sort
function. After sorting, the "data" integers are placed in an array, and the
"tag" integers are placed on the certification trail. Thus, the sort call looks
the same as a normal sort function. The time to massage the data in this
manner is included in the cost of the call. This method resulted in only
a small speedup, because of the overhead involved in massaging the data,
and because the sort routine must swap pairs of integers instead of single
integers. The integers were chosen uniformly over the range 0 to 1,000,000.
The second method was to sort an array of pointers to structures. In this
case it was assumed that the structure contained a field that would serve
as the tag. The sort program needed only to fill in this field, and not copy
the structures to a second array. This method results in dramatic speedups.
Integer keys were used, though a more complex key will work as well (in
fact, a more complex key is very likely to increase the speedup achieved).
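The integer-sorting variant described above can be sketched in Python as follows. The first execution tags each input value with its position, sorts the pairs, and places the tags on the trail; the second execution uses the tags to permute the input in linear time and checks that the result is in order. The function names and the ERROR token are assumptions for this illustration, and Python's built-in sort stands in for the quicker-sort routine used in the experiments.

```python
ERROR = "error"  # hypothetical error symbol


def sort_generate(data):
    """First execution: return (sorted data, trail of tags)."""
    pairs = sorted((value, tag) for tag, value in enumerate(data))
    return [v for v, _ in pairs], [t for _, t in pairs]


def sort_use(data, trail):
    """Second execution: rebuild the sorted output from the trail in O(n)."""
    n = len(data)
    if not isinstance(trail, list) or len(trail) != n:
        return ERROR
    seen = [False] * n
    out = []
    for t in trail:
        # Each tag must be a valid position that is used exactly once,
        # so the trail is a permutation of the input positions.
        if not isinstance(t, int) or not 0 <= t < n or seen[t]:
            return ERROR
        seen[t] = True
        out.append(data[t])
    # The permuted data must be in nondecreasing order.
    for a, b in zip(out, out[1:]):
        if a > b:
            return ERROR
    return out
```

Any trail that passes both tests yields a sorted permutation of the input, so an incorrect trail is either rejected or still produces the correct output.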
For the priority queue and generalized priority queue tables, the input
size n is the number of commands executed. The item numbers range from
1 to n (i.e., there are as many item numbers as there are commands). The
commands are not chosen with equal probability, but rather the first n/2
are weighted toward insert operations while the second half are weighted
toward the other operations, the weightings remaining the same for all runs.
This weighting is necessary in order to force a large queue.
The timing data displayed in the tables should be considered not only
with respect to the overall efficiency of the certification-trail method
compared to classical time redundancy but also in light of the probabilistic
analysis given in Section 9, in which we show that when the certification-trail
method has a smaller execution time than the time-redundancy approach it
yields strictly superior performance. This means the certification-trail
method has both a smaller probability of error and a smaller probability of
undetected error.
5.3 Convex Hull Example
The convex hull problem is a fundamental one in computational geometry.
Our certification trail solution is based on a solution due to Graham [24]
which is called Graham's Scan. For basic definitions in computational
geometry see the text of Preparata and Shamos [46]. For simplicity in the
discussion which follows we will assume the points are in so-called general
position, e.g., no three points are collinear. It is not hard to remove this
restriction.
Definition 5.1 The convex hull of a set of points, S, in the Euclidean
plane is defined as the smallest convex polygon enclosing all the points.
This polygon is unique and its vertices are a subset of the points in S. It is
specified by a counterclockwise sequence of its vertices.
Figure 2(c) shows a convex hull for the points indicated by black dots.
The algorithm given below constructs the convex hull incrementally in a
counterclockwise fashion. Sometimes it is necessary for the algorithm to
"back up" the construction by throwing some vertices out and then
continuing. The first step of the algorithm selects an "extreme" point and calls
it $p_1$. The next two steps sort the remaining points in a way which is
depicted in Figure 2(a). It is not hard to show that after these three steps the
points when taken in order, $p_1, p_2, \ldots, p_n$, form a simple polygon, although
this polygon may not be convex. It is possible to think of the algorithm
as removing points from this simple polygon until it becomes convex. The
main FOR loop iteration adds vertices to the polygon under construction
and the inner WHILE loop removes vertices from the construction. A point
is removed when the angle test performed at line 6 reveals that it is not on
the convex hull because it falls within the triangle defined by three other
points. A "snapshot" of the algorithm given in Figure 2(b) shows that $q_5$
is removed from the hull. The angle formed by $q_4, q_5, p_6$ is less than 180
degrees. This means $q_5$ lies within the triangle formed by $q_4, p_1, p_6$. (Note,
$q_1 = p_1$.) In general, when the angle test is performed, if the angle formed by
$q_{m-1}, q_m, p_k$ is less than 180 degrees then $q_m$ lies within the triangle formed
by $q_{m-1}, p_1, p_k$. Below it will be revealed that this is the main fact that
our certification trail relies on. When the main FOR loop is complete the
convex hull has been constructed.
[Figure 2: Convex hull example.]
Algorithm CONVEXHULL(S)
Input: Set of points, S, in $R^2$
Output: Counterclockwise sequence of points in $R^2$ which define convex hull of S
1 Let $p_1$ be the point with the largest x coordinate (and smallest y to break ties)
2 For each point p (except $p_1$) calculate the slope of the line through $p_1$ and p
3 Sort the points (except $p_1$) from smallest slope to largest. Call them $p_2, \ldots, p_n$
4 $q_1 := p_1$; $q_2 := p_2$; $q_3 := p_3$; m := 3
5 FOR k = 4 to n DO
6   WHILE the angle formed by $q_{m-1}, q_m, p_k$ is < 180 degrees DO m := m - 1 END
7   m := m + 1
8   $q_m := p_k$
9 END FOR
10 FOR i = 1 to m DO OUTPUT($q_i$) END FOR
END CONVEXHULL
First execution: In this execution the code CONVEXHULL is used.
The certification trail is generated by adding an output statement within the
WHILE loop. Specifically, if an angle of less than 180 degrees is found in the
WHILE loop test then the four-tuple consisting of $q_m, q_{m-1}, p_1, p_k$ is output
to the certification trail. The table below shows the four-tuples of points
that would be output by the algorithm when run on the example in Figure
2. The points in the table are given the same names as in Figure 2(a). The
final convex hull points $q_1, \ldots, q_m$ are also output to the certification trail.
Strictly speaking the trail output does not consist of the actual points in $R^2$.
Instead, it consists of indices to the original input data. This means if the
original data consists of $s_1, s_2, \ldots, s_n$ then rather than output the element in
$R^2$ corresponding to $s_i$ the number i is output. It is not hard to code the
program so that this is done.
Point not on convex hull    Three surrounding points
$p_5$                       $p_4, p_1, p_6$
$p_4$                       $p_3, p_1, p_6$
$p_7$                       $p_6, p_1, p_8$
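The first execution can be sketched in Python as shown below. This is an illustrative version under stated assumptions, not the authors' code: points are (x, y) tuples in general position with at least three of them, the "angle less than 180 degrees" test is expressed as a non-left turn using a cross product, and the sort by slope is replaced by an equivalent exact comparison of directions around $p_1$. The function names are hypothetical. The trail holds indices into the input, as described above.

```python
from functools import cmp_to_key


def cross(o, a, b):
    """Positive if o -> a -> b turns counterclockwise, negative if clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_generate(points):
    """Graham's scan; returns (hull indices, trail).

    The trail is (removed, hull), where removed is a list of index
    four-tuples (q_m, q_{m-1}, p_1, p_k), one per discarded point.
    """
    n = len(points)
    # Step 1: largest x coordinate, smallest y to break ties.
    first = max(range(n), key=lambda i: (points[i][0], -points[i][1]))
    p1 = points[first]

    # Steps 2-3: counterclockwise order around p1. Every other point lies
    # in a half-plane to the left of p1, so this matches sorting by slope.
    def ccw_order(i, j):
        c = cross(p1, points[i], points[j])
        return -1 if c > 0 else (1 if c < 0 else 0)

    rest = sorted((i for i in range(n) if i != first),
                  key=cmp_to_key(ccw_order))
    order = [first] + rest

    hull = order[:3]  # step 4
    removed = []
    for k in order[3:]:  # step 5
        # Step 6: discard q_m while it lies inside triangle q_{m-1}, p1, pk.
        while len(hull) >= 2 and cross(points[hull[-2]], points[hull[-1]],
                                       points[k]) <= 0:
            removed.append((hull[-1], hull[-2], first, k))
            hull.pop()
        hull.append(k)  # steps 7-8
    return list(hull), (removed, list(hull))
```

For the six points (0,0), (5,1), (4,5), (1,4), (2,2), (3,2) the sketch discards the two interior points and records one four-tuple for each of them.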
Second execution: Let the certification trail consist of a set of
four-tuples, $(x_1, a_1, b_1, c_1), (x_2, a_2, b_2, c_2), \ldots, (x_r, a_r, b_r, c_r)$,
followed by the supposed convex hull, $q_1, q_2, \ldots, q_m$. The code for
CONVEXHULL is not used in this execution. Indeed, the algorithm performed is
dramatically different from CONVEXHULL.
It consists of five checks on the trail data.
• First, the algorithm checks for $i \in \{1, \ldots, r\}$ that $x_i$ lies within the
triangle defined by $a_i$, $b_i$, and $c_i$.
• Second, the algorithm checks that for each triple of counterclockwise
consecutive points on the supposed convex hull the angle formed by
the points is less than or equal to 180 degrees.
• Third, it checks that there is a one to one correspondence between the
input points and the points in $\{x_1, \ldots, x_r\} \cup \{q_1, \ldots, q_m\}$.
• Fourth, it checks that for $i \in \{1, \ldots, r\}$, $a_i$, $b_i$, and $c_i$ are among the
input points.
• Fifth, it checks that there is a unique point among the points on the
supposed convex hull which is a local extreme point. We say a point
q on the hull is a local extreme point if its predecessor in the
counterclockwise ordering has a strictly smaller y coordinate and its successor
in the ordering has a smaller or equal y coordinate.
If any of these checks fails then execution halts and "error" is output. As
mentioned above, the trail data actually consists of indices into the input
data. This does not unduly complicate the checks above; instead it makes
them easier. The correctness and adequacy of these checks must be proven.
Because of space limitations we shall not give the proof here.
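The five checks can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation, and it is not a substitute for the proof of adequacy: the trail is assumed to be a pair (removed, hull) of index four-tuples and hull indices, points are assumed to be in general position, and "within the triangle" is interpreted as strictly inside. The names convex_hull_check and ERROR are hypothetical.

```python
ERROR = "error"  # hypothetical error symbol


def cross(o, a, b):
    """Positive if o -> a -> b turns counterclockwise, negative if clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_check(points, trail):
    """Second execution: return the hull indices or ERROR, in O(n) time."""
    n = len(points)
    try:
        removed, hull = trail
        removed = [tuple(t) for t in removed]
        hull = list(hull)
    except (TypeError, ValueError):
        return ERROR
    m = len(hull)
    if m < 3 or any(len(t) != 4 for t in removed):
        return ERROR

    def valid(i):
        return isinstance(i, int) and 0 <= i < n

    # Fourth check (done first so that later indexing is safe):
    # every index in the trail must refer to an input point.
    if not all(valid(i) for t in removed for i in t):
        return ERROR
    if not all(valid(i) for i in hull):
        return ERROR
    # First check: each x lies strictly inside its triangle a, b, c.
    for x, a, b, c in removed:
        px, pa, pb, pc = (points[i] for i in (x, a, b, c))
        d = (cross(pa, pb, px), cross(pb, pc, px), cross(pc, pa, px))
        if not (all(v > 0 for v in d) or all(v < 0 for v in d)):
            return ERROR
    # Second check: no clockwise turn at any consecutive hull triple.
    for j in range(m):
        if cross(points[hull[j - 2]], points[hull[j - 1]],
                 points[hull[j]]) < 0:
            return ERROR
    # Third check: removed points and hull points together cover
    # every input index exactly once.
    seen = [False] * n
    for i in [t[0] for t in removed] + hull:
        if seen[i]:
            return ERROR
        seen[i] = True
    if not all(seen):
        return ERROR
    # Fifth check: exactly one local extreme point on the hull.
    extremes = 0
    for j in range(m):
        y_prev = points[hull[j - 1]][1]
        y_here = points[hull[j]][1]
        y_next = points[hull[(j + 1) % m]][1]
        if y_prev < y_here and y_next <= y_here:
            extremes += 1
    return hull if extremes == 1 else ERROR
```

Each loop makes a single pass over the trail or the hull and every geometric test is a constant number of multiplications, so the whole check is linear in the number of input points.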
Time complexity: In the first execution the sorting of the input points
takes O(nlog(n)) time where n is the number of input points. One can show
that this cost dominates and the overall complexity is O(nlog(n)).
Size     Basic      Generate  Use      Compare  Total   Total    % Saving
         Algorithm  Certif.   Certif.           Basic   Certif.
10000    0.74       0.79      0.11     0.03     1.51    0.93     38.41
20000    1.65       1.75      0.23     0.06     3.36    2.05     39.28
50000    4.64       4.79      0.59     0.14     9.42    5.52     41.40
100000   9.95       10.32     1.19     0.28     20.18   11.79    41.57
Table 1: Huffman Tree on Sun

Size     Basic      Generate  Use      Compare  Total   Total    % Saving
         Algorithm  Certif.   Certif.           Basic   Certif.
10000    1.09       1.32      0.32     0.10     2.28    1.74     23.68
20000    2.38       2.91      0.63     0.21     4.97    3.75     24.55
50000    7.01       8.80      1.59     0.50     14.52   10.89    25.00
Table 2: Huffman tree on 386/33
It is possible to implement the second execution so that all five checks are
done in O(n) time. Checking
that a point lies within a triangle is a geometric calculation that can be done
in constant time. Comparing the angle formed by three points to 180
degrees can be done in constant time. The third and fourth checks can be
done in O(n) because the certification trail contains indices into the input
data as described above. The uniqueness of the "local extreme" can also be
checked in linear time.
5.4 Minimum Spanning Tree Example
This classic problem has been examined extensively in the literature and
an historical survey is given in [25]. Our approach is applied to a variant
Size     Basic      Generate  Use      Compare  Total   Total    % Saving
         Algorithm  Certif.   Certif.           Basic   Certif.
10000    1.26       1.29      0.13     0.01     2.53    1.43     43.47
20000    2.71       2.81      0.31     0.01     5.43    3.13     42.35
50000    7.41       7.48      0.70     0.01     14.83   8.19     44.77
100000   15.76      15.87     1.43     0.01     31.53   17.31    45.09
Table 3: Convex Hull on Sun

Size     Basic      Generate  Use      Compare  Total   Total    % Saving
         Algorithm  Certif.   Certif.           Basic   Certif.
10000    1.79       1.88      0.15     0.01     3.59    2.04     43.18
20000    3.86       4.08      0.31     0.01     7.73    4.40     43.08
50000    10.51      11.16     0.78     0.01     21.03   11.95    43.18
100000   22.40      23.97     1.64     0.01     44.81   25.62    42.83
Table 4: Convex Hull on 386/33

Size         Basic      Generate  Use      Compare  Total   Total    % Saving
             Algorithm  Certif.   Certif.           Basic   Certif.
100,1000     0.04       0.05      0.01     0.00     0.08    0.06     25.00
200,2000     0.10       0.12      0.02     0.00     0.20    0.14     30.00
500,5000     0.30       0.31      0.06     0.00     0.60    0.37     38.33
1000,10000   0.68       0.72      0.13     0.00     1.36    0.85     37.50
1500,15000   1.10       1.14      0.19     0.00     2.20    1.33     39.55
2000,20000   1.51       1.58      0.27     0.00     3.02    1.85     38.74
2500,25000   1.97       2.00      0.35     0.00     3.94    2.35     40.36
Table 5: Minimum Spanning Tree on Sun

Size         Basic      Generate  Use      Compare  Total   Total    % Saving
             Algorithm  Certif.   Certif.           Basic   Certif.
100,1000     0.04       0.03      0.01     0.00     0.08    0.04     50.00
200,2000     0.08       0.08      0.02     0.00     0.16    0.10     37.50
500,5000     0.26       0.24      0.06     0.00     0.52    0.30     42.31
1000,10000   0.59       0.56      0.13     0.00     1.18    0.69     41.53
1500,15000   0.93       0.90      0.20     0.00     1.86    1.10     40.86
2000,20000   1.29       1.28      0.28     0.00     2.58    1.56     39.53
2500,25000   1.67       1.65      0.36     0.00     3.34    2.01     39.82
Table 6: Shortest Path on Sun

Size     Basic      Generate  Use      Compare  Total   Total    % Saving
         Algorithm  Certif.   Certif.           Basic   Certif.
10000    0.23       0.40      0.06     0.01     0.47    0.47     0.00
20000    0.51       0.86      0.13     0.01     1.02    1.00     1.96
50000    1.38       2.35      0.35     0.02     2.78    2.72     2.15
100000   2.96       4.97      0.76     0.04     5.92    5.73     3.20
Table 7: Integer sorting on Sun
Size Basic
Algorithm
10000 1.02
Generate Use Compare
Certif. Certif.
1.18 0.14 0.04
11.74
20000 2.16 2.49 0.29 0.08
50000 5.67 6.48 0.73 0.22
100000 13.48 1.57 0.44
Table 8: Integer Sort on 386/33
Total
Basic
2.08
4140
11.56
23.92
Total % Saving
Certif.
1.36 34.62
2.86 35.00
7.43 35.73
15.49 35.24
I
m
Size
10000
20000
Basic
Algorithm
0.32
0.71
Generate
Certif.
0.33
0.72
Use
Certif.
0.03
0.07
0.18
Compare
0.01
0.01
50000 1.97 1.99 0.02
100000 4.32 4.37 0.38 0.05
Table 9: Pointer sorting on Sun
Size Basic Generate Use Compare
Algorithm Certif. Certif.
10000 1.08 1.15 0.07 0.03
20000 2.41 2.41 0.16 0.07
50000
100000
6.37
13.29
6.38 0.42 0.22
13.33 0.89 0.43
Table 10: Pointer Sort on 386/33
Total
Basic
0.65
1.43
3.96
8.69
TotM
Basic
2.19
4.89
12.96
27.0i
Total %Saving
Certif.
0.37 43.07
0.80 44.05
2.19 44.69
4.80 44.76
Total % Saving
Certif.
1.25 42.92
2.64 46.01
7.02 45.83
14.65 45.76
Size
10000
20000
50000
Basic Generate Use Compare
Algorithm Certif. Certif.
0.86 0.83 0.14 0.01
1.92 1.87 0.28 0.01
5.32 5.37 0.69 0.02
Table 11: Data structs on Sun
Total
Basic
1.73
3.85
10.64
Total % Saving
Certif.
0.98 43.35
2.16 43.89
6.08 42.85
16
w
Size
8
16
32
64
128
256
512
Basic
Algorithm
0.075
0.215
0.561
1.330
3.120
Generate
Certif.
0.091
0.248
0.629
1.468
3.398
Use
Certif.
0.026
0.054
0.111
0.224
0.450
0.9037.225 7.783
16.270 17.388 1.808
Total
Basic
0.151
0.430
1.122
2.660
6.240
14.450
32.540
Total % Saving
Certif.
0.117 28.7
0.302 42.4
0.740 51.6
1.692 57.2
3.848 62.2
8.686 66.4
19.196 69.5
Table 12: Huffman Tree on 68000-based system
w
n
i
u
w
u
Size
Nodes Edges
10 15
10 20
10 25
5O 75
50 100
50 125
100 150
100 200
100 250
500 750
5OO 1000
500 125o
1000 1500
1000 200O
1000 2500
1500 225O
1500 3000
Basic
Algorithm
Generate
Certif.
Use
Certif.
Total
Basic
0.053 0.054 0.055 0.106
0.071 0.072 0.073 0.142
0.088 0.089 0.176
0.323
0.427
0.320
0.423
0.090
0.309
0.400
0.464
0.602
0.789
0.938
0.492 0.496
0.652 0.658
0.874 0.881
1.036 1.045
3.588 3.617 3.047
4.780 4.817 3.955
5.656 5.698 4.717
7.474 7.533 6.115
9.902 9.977 7.919 19.803
11.830 11.917 9.517 23.660
11.503 22.8309.157
11.802
11.415
14.967 15.077
Total
Certif.
0.109
0.145
0.179
% Saving
-2.5
-1.7
-1.5
0.639 0.632 1.2
0.826 2.3
1.983
6.664
8.772
10.415
13.649
17.895
21.434
20.660
0.960 2.5
1.260 3.6
1.671 4.6
4.5
0.846
O.984
1.305
1.748
2.073
7.176
9.560
11.311
14.949
29.933 26.879
7.7
9.0
8.6
9.5
10.7
10.4
10.5
11.4
Table 13: Min Spanning Tree on 68000-based system
17
M! !]
u
of the Prim/Dijkstra algorithm [47, 18] as explicated in [54]. We provide a
definition of the problem below. For more information on the graph theoretic
terminology used in this problem and others the reader may consult [54, 17].
Definition 5.2 Let $G = (V, E)$ be a graph and let $w$ be a positive rational
valued function defined on $E$. A subtree of $G$ is a tree, $T(V', E')$, with
$V' \subseteq V$ and $E' \subseteq E$. We say $T$ spans $V'$ and $V'$ is spanned by $T$. If $V' = V$
then we say $T$ is a spanning tree of $G$. The weight of this tree is $\sum_{e \in E'} w(e)$.
A minimum spanning tree is a spanning tree of minimum weight.
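
To make Definition 5.2 concrete, the following minimal Python sketch checks whether a candidate edge set is a spanning tree of a graph and, if so, returns its weight. The function name `spanning_tree_weight` and the representation of edges as pairs with weights keyed by `frozenset` are assumptions made for this illustration; it is not the certification-trail algorithm of [1].

```python
def spanning_tree_weight(vertices, edges, weight, tree_edges):
    """Return the weight of tree_edges if they form a spanning tree of
    the graph (vertices, edges); otherwise return None.

    weight maps frozenset({u, v}) to the weight of edge (u, v).
    """
    edge_set = {frozenset(e) for e in edges}
    # E' must be a subset of E.
    if any(frozenset(e) not in edge_set for e in tree_edges):
        return None
    # A tree on |V| vertices has exactly |V| - 1 edges.
    if len(tree_edges) != len(vertices) - 1:
        return None

    # Union-find: |V| - 1 edges without a cycle connect all of V.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return None  # the edge would close a cycle
        parent[ru] = rv

    return sum(weight[frozenset(e)] for e in tree_edges)


vertices = {1, 2, 3, 4}
edges = [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]
weight = {frozenset(e): w for e, w in zip(edges, [1, 2, 3, 4, 5])}
print(spanning_tree_weight(vertices, edges, weight, [(1, 2), (2, 3), (3, 4)]))
```

A minimum spanning tree is then any edge set for which this function returns the smallest value among all spanning trees of the graph.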
The problem is to input a graph with edge weights and output a minimum
spanning tree. The algorithm for this problem which has the fastest
asymptotic time complexity uses fusion trees and is given in [20]. This
algorithm, however, appears to have a large constant of proportionality. Other
asymptotically fast algorithms [22] also appear to be handicapped by large
constants of proportionality. A fuller discussion of the two algorithms we
employ for generation and use of a certification trail is given in [1].
5.5 Shortest Path Example
This is another classic problem which has been examined extensively in the
literature. Our approach is applied to a variant of the Dijkstra algorithm
[18] as explicated in [54]. We are concerned with the single source problem,
i.e., given a graph and a vertex s, find the shortest path from s to v for
every vertex v.
The algorithm for this problem which has the fastest asymptotic time
complexity uses fusion trees and is given in [20], the same paper cited
above for the minimum spanning tree problem. This algorithm, however,
appears to have a large constant of proportionality. Our solution employing
the certification trail method is very closely based on the solution we
gave for the minimum spanning tree problem [1].
5.6 Huffman Tree Example
This is another old algorithmic problem and one of the original solutions
was found by Huffman [30]. It has been used extensively to perform data
compression through the design and use of so-called Huffman codes. These
codes are prefix codes which are based on the Huffman tree and which
yield excellent data compression ratios. The tree structure and the code
design are based on the frequencies of individual characters in the data to
be compressed. Here we are concerned exclusively with the Huffman tree.
See [30] for information about the coding application.
Definition 5.3 The Huffman tree problem is the following: Given a
sequence of frequencies (positive integers) $f[1], f[2], \ldots, f[n]$, construct a tree
with $n$ leaves and with one frequency value assigned to each leaf so that
the weighted path length is minimized. Specifically, the tree should
minimize the following sum: $\sum_{l_i \in LEAF} len(i) f[i]$ where $LEAF$ is the set of leaves,
$len(i)$ is the length of the path from the root of the tree to the leaf $l_i$, and $f[i]$ is
the frequency assigned to the leaf $l_i$.
The method we employ to generate and use a certification trail is detailed
in the following technical report [2].
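
As an illustration of the quantity in Definition 5.3, the sketch below computes the minimum weighted path length with the classical greedy construction (repeatedly merging the two smallest frequencies) on top of Python's `heapq`. This is the textbook Huffman procedure, not the certification-trail variant of [2], and the function name is an assumption of this example.

```python
import heapq


def huffman_weighted_path_length(freqs):
    """Return the minimum of sum(len(i) * f[i]) over all trees with one
    leaf per frequency, using the standard greedy Huffman merge."""
    if len(freqs) < 2:
        return 0  # a single leaf is the root, so its path length is 0
    heap = list(freqs)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two smallest remaining weights
        b = heapq.heappop(heap)
        # Each merge adds one level above every leaf beneath it, so the
        # sum of merged weights equals the weighted path length.
        total += a + b
        heapq.heappush(heap, a + b)
    return total


print(huffman_weighted_path_length([1, 1, 2, 3, 5]))
```

For the frequencies 1, 1, 2, 3, 5 the leaves end up at depths 4, 4, 3, 2, 1, giving 4 + 4 + 6 + 6 + 5 = 25.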
5.7 Sorting Example
This important problem has a massive literature. In this section we will
discuss how to apply the certification trail approach to the sorting problem.
Let us assume that the sorting algorithm takes as input an array of n ele-
ments and outputs an array of n elements. The algorithm is supposed to
place the data into non-decreasing order.
To design a certification trail algorithm we must discover the nature of
the data that should be included in the certification trail to allow quick
computation of the final output sorted array. Suppose that we decide to
use the output array itself as the certification trail. We note that it is easy
to check that this array is in non-decreasing order by simply performing a
single pass over the array. Unfortunately, it is considerably more difficult to
make sure that this array contains exactly the same elements as the original
input array. Indeed, this problem has a lower bound time complexity of
$\Omega(n \log n)$ in a comparison based model.
Because of this difficulty we use the permutation of the elements defined
by the input and output data arrays as the certification trail. To compute
this permutation we allocate a new array of size n called permute which
is initialized by setting its ith element to i. (Alternatively, we add a new
field to pre-existing structures when structures are being sorted.) Each time
the sort algorithm exchanges two elements the corresponding elements in
the permute array are also exchanged. (If structures are being used then
this happens automatically.) This approach works with all sort algorithms
which are based on exchanging array elements. The code below shows how
the permute array is used to rapidly recompute the final sorted output array
and how the permute array itself is checked.
Algorithm SORT USING TRAIL
Input: Arrays indata[1..n] and permute[1..n]
Output: outdata[1..n] containing the data in indata sorted into non-decreasing order
The first part of the algorithm checks that the permute values are in the
proper range and constructs the output array.
FOR i := 1 to n DO
    IF permute[i] > n or permute[i] < 1
        THEN OUTPUT("Error: not a permutation") STOP
        ELSE outdata[i] := indata[permute[i]]
END FOR
The next part of the algorithm checks that the output array is properly
ordered.
FOR i := 2 to n DO
    IF outdata[i - 1] > outdata[i] THEN OUTPUT("Error: decreasing value") STOP
END FOR
The final part of the algorithm checks that the permute array defines a
proper permutation, i.e., each element is mapped to exactly one element.
FOR i := 1 to n DO present[i] := FALSE END FOR
FOR i := 1 to n DO
    IF present[permute[i]] = TRUE
        THEN OUTPUT("Error: not a permutation") STOP
        ELSE present[permute[i]] := TRUE
END FOR
END SORT USING TRAIL
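
The pseudocode above can be sketched in Python as follows. Two things here are assumptions of the illustration rather than part of the paper: indices are 0-based, and the first execution obtains the permutation by sorting index positions with the built-in `sorted` instead of mirroring the exchanges of a quicksort. The second execution, `sort_using_trail`, follows the three checks of SORT USING TRAIL and runs in linear time.

```python
def sort_generating_trail(indata):
    """First execution: sort and emit the permutation as the trail.
    (Stand-in for an exchange-based sort that tracks a permute array.)"""
    permute = sorted(range(len(indata)), key=lambda i: indata[i])
    outdata = [indata[p] for p in permute]
    return outdata, permute


def sort_using_trail(indata, permute):
    """Second execution: rebuild and verify the sorted output in O(n)."""
    n = len(indata)
    if len(permute) != n:
        raise ValueError("not a permutation")

    # Part 1: range check and construction of the output array.
    outdata = []
    for p in permute:
        if not 0 <= p < n:
            raise ValueError("not a permutation")
        outdata.append(indata[p])

    # Part 2: the output must be in non-decreasing order.
    for i in range(1, n):
        if outdata[i - 1] > outdata[i]:
            raise ValueError("decreasing value")

    # Part 3: every index must appear exactly once in the trail.
    present = [False] * n
    for p in permute:
        if present[p]:
            raise ValueError("not a permutation")
        present[p] = True

    return outdata


data = [30, 10, 20, 10]
first_result, trail = sort_generating_trail(data)
second_result = sort_using_trail(data, trail)
print(first_result == second_result)
```

A trail such as `[1, 1, 2, 0]` passes the ordering check on this input but is rejected by the third part, which is why the permutation test is needed in addition to the single ordering pass.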
Our experimental work on the Sun was based on a variant of quicksort
[26] which is called quickersort [50]. The implementation of this algorithm
that we used was provided by a Berkeley UNIX software distribution for
the Sun. Our experimental work on the IBM PC was based on a quicksort
algorithm implemented as part of a GNU library of functions.
6 Answer-Validation Problem for Abstract Data Types
The next few sections of this paper are concerned with the answer-validation
problem for abstract data types. This kind of problem was originally pro-
posed in [3] and provides a basis for applying the certification-trail method
to wide classes of algorithms. Because of space limitations we will not discuss
the details of how this can be done.
Below, we define the answer-validation problem. Next, we give two ex-
ample algorithms for the answer-validation problems. The first algorithm
is for a priority queue which allows insert, min and deletemin operations.
The second algorithm is for a priority queue which allows insert, min, delete
and deletemin operations. In the next section experimental data on the
execution times of these algorithms is presented.
For each abstract data type we define an answer-validation problem. In-
tuitively, the answer validation problem consists of checking the correctness
of a sequence of supposed answers to a sequence of operations performed on
the abstract data type. More formally, the input to the answer-validation
problem is a sequence of operations on the abstract data type together with
the arguments of each operation. In addition, the sequence contains the
supposed answers for each of the operations which return answers. In par-
ticular, each supposed answer is paired with the operation that is supposed
to return it. Examples of such inputs are given in the columns labelled
"Operation" and "Answer" of table 15.
The output for the answer-validation problem is the word "correct" if
the answers given in the input match the answers that would be generated
by actually performing the operations. The output is the word "incorrect"
if the answers do not match. It is also useful to allow the output word to
say "ill-formed". This output is used if the sequence of operations is ill-
formed, e.g., an operation has too many arguments or an argument refers
to an inappropriate object.
The answer-validation problem is similar to the idea of an acceptance
test which is used in the recovery-block approach [48, 6] to software fault
tolerance. The main difference is that an answer-validation problem is de-
pendent upon a sequence of answers, not just an individual answer. Hence,
if an incorrect answer appears in the sequence, it may not be detected imme-
diately. It is guaranteed, however, that an incorrect answer will be detected
at some point during the processing of the entire sequence. By allowing
for this latency in detection, it is possible to create a much more efficient
procedure for solving the answer-validation problem.
The most important aspect of the answer-validation problem is that it
is often possible to check the correctness of the answers to a sequence of
operations much more quickly than actually calculating what the answers
should be from scratch. In other words, the answer-validation problem has a
smaller time complexity than the original abstract-data-type problem. This
speedup is very useful in fault-detection applications.
It is possible to run an answer-validation algorithm for some abstract
data type concurrently with some algorithm which uses the abstract data
type. The answer-validation algorithm could act as a monitor making sure
that all interactions with the abstract data type are handled correctly. This
is valuable because many algorithms spend a large fraction of their time
operating on abstract data types. Note that the overhead of this monitor is
less than the overhead of actually performing the data-type operations a
second time.
7 Answer Validation for Priority Queue
We will first consider the priority-queue abstract data type which allows
only three operations: insert, min and deletemin. An example of a sequence
of such operations appears in table 14. Many different data structures can
be used to implement priority queues including heaps [61] and balanced
search trees such as AVL trees [5], red-black trees [27], or B-trees [13]. It
is possible to process a sequence of O(n) operations in O(n log(n)) time
using the data structures above. Furthermore, there is a lower bound of
Ω(n log(n)) because it is possible to sort using a priority queue.
Remarkably, the answer-validation problem can be solved using only O(n) time, as
documented below.
The algorithm which we present in this section is the same as that given
in [3]. It is necessary to include a description of this algorithm because the
algorithm in the next section (which has not appeared before) builds on this
algorithm.
Each operation is time-stamped, i.e., the operations are assigned integers
sequentially starting with 1 which is easy to do with a counter. The answer-
validation algorithm uses a stack called answerstack. The contents of this
stack are illustrated in table 14. The top of the stack is on the left in table 14.
Time  Operation   Answer     Insert time  Stack used in validation
1     insert(6,300)
2     insert(2,404)
3     insert(3,250)
4     deletemin   (3,250)    3            (3,250,4)
5     insert(10,248)
6     insert(12,245)
7     insert(4,260)
8     min         (12,245)   6            (12,245,8), (3,250,4)
9     insert(13,140)
10    insert(5,142)
11    deletemin   (13,140)   9            (13,140,11), (12,245,8), (3,250,4)
12    deletemin   (5,142)    10           (5,142,12), (12,245,8), (3,250,4)
13    deletemin   (12,245)   6            (12,245,13), (3,250,4)
14    deletemin   (10,248)   5            (10,248,14), (3,250,4)
15    deletemin   (4,260)    7            (4,260,15)
Table 14: Sequence of Priority Queue operations illustrating answer
validation algorithm

Let us consider the kinds of tests that an answer-validation algorithm
for a priority queue might perform. Suppose (i,k) is the answer to some
min or deletemin operation. Further, suppose (i',k') was the answer to a
previous min or deletemin operation. If the priority queue is correct then
either (i,k)≥(i',k') or (i,k) was inserted after the answer (i',k') was given.
This suggests that the time of insertion for an
element and the time of an answer should be recorded and the algorithm
below does this. Unfortunately, if an algorithm compares an ordered pair
which has been given as an answer against all previous answers then the
algorithm takes $\Omega(m^2)$ time in the worst case. To avoid this a stack called the
answerstack is used. The answerstack was designed to allow many
comparisons to be done implicitly and thus the overall complexity of the many tests
is reduced.
Algorithm for Answer Validation for Priority Queue
Input: Sequence of m operations together with arguments and supposed
answers for the priority-queue data type.
Output: "correct", "incorrect" or "ill-formed"
Declarations: Array called inserttime indexed by item number. Array
elements contain either "absent" or a time-stamp. Array called keyvalue
indexed by item number. Array elements contain either "absent" or a key
value. Initially, each element in these two arrays contains "absent". Stack
of ordered triples called answerstack. Each ordered triple has the following
form: first element is an item number, second element is a key value, and
third element is a time-stamp. answerstack is initially empty.
First phase: In this phase we process each operation as it appears serially
using the following rules:
Let currenttime refer to the time-stamp of the operation being processed.
insert(i,k): If inserttime[i]≠"absent" then output "ill-formed" and stop.
Otherwise, let inserttime[i]=currenttime and let keyvalue[i]=k.
min (i,k): (where (i,k) is the supposed answer to the min operation.)
If inserttime[i]="absent" or keyvalue[i]≠k then output "ill-formed"
and stop.
Otherwise, let (i',k') be the item number and key value of the triple on
the top of answerstack (if there is one). Repeatedly pop the stack until
(i,k)<(i',k') or until answerstack is empty.
If answerstack is empty then push the triple (i,k,currenttime) onto
answerstack and process the next priority queue operation.
If answerstack is non-empty then let the top element be (i',k',answertime').
If inserttime[i]<answertime' then output "incorrect" and stop. Otherwise,
push the triple (i,k,currenttime) onto answerstack and process the next
priority queue operation.
deletemin (i,k): (where (i,k) is the supposed answer to the deletemin
operation.) Perform the same actions as those described for the min
operation. However, just before processing the next priority queue operation, let
inserttime[i]="absent" and let keyvalue[i]="absent".
Second phase: In this phase we operate on the items which have been
inserted but have never been deleted.
Scan the array inserttime and for each item number for which inserttime[i]≠"absent"
construct an ordered triple (i,keyvalue[i],inserttime[i]). Call this set of
ordered triples remainders.
Use a bucket sort to sort the triples in remainders by their time-stamps, i.e.,
the third element of the ordered triple.
Merge the triples in remainders together with the triples in answerstack so
that they are all ordered by their time-stamps, i.e., the third element of the
ordered triple.
Scan the combined triples to determine if there exist two triples which satisfy
the following: inserttime[i]<answertime' and (i,keyvalue[i])<(i',k'), where
one triple is from remainders and has the form (i,keyvalue[i],inserttime[i])
and where the other triple is from answerstack and has the form (i',k',answertime').
If these two triples exist then output "incorrect" and stop. Otherwise output
"correct" and stop.
Theorem 7.1 The algorithm for answer validation of the priority queue
abstract data type is correct.
Theorem 7.2 The answer validation algorithm for priority queue has a
time complexity of O(n) for processing a sequence of O(n) operations.
For proofs of these theorems see [3].
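
The two phases of the algorithm in this section can be sketched in Python as shown below. Several details are assumptions of this illustration: operations are given as `(name, item, key)` tuples, ordered pairs are compared by key first and item number second, dictionaries replace the "absent"-initialized arrays, and the built-in `sorted` stands in for the bucket sort (so this sketch is not strictly linear time). The function name `validate_priority_queue` is hypothetical.

```python
def validate_priority_queue(ops):
    """ops: sequence of ('insert'|'min'|'deletemin', item, key), where
    (item, key) is the argument of insert or the supposed answer of
    min/deletemin. Returns 'correct', 'incorrect' or 'ill-formed'."""
    inserttime, keyvalue = {}, {}   # a missing entry plays the role of "absent"
    answerstack = []                # triples (item, key, time); top is the last entry

    # First phase: process the operations serially with time-stamps 1, 2, ...
    for now, (name, i, k) in enumerate(ops, start=1):
        if name == "insert":
            if i in inserttime:
                return "ill-formed"
            inserttime[i], keyvalue[i] = now, k
        elif name in ("min", "deletemin"):
            if i not in inserttime or keyvalue[i] != k:
                return "ill-formed"
            # Pop until (i,k) < (i',k') or the stack is empty.
            while answerstack and not (k, i) < (answerstack[-1][1], answerstack[-1][0]):
                answerstack.pop()
            # A smaller answer must have been inserted after the top answer.
            if answerstack and inserttime[i] < answerstack[-1][2]:
                return "incorrect"
            answerstack.append((i, k, now))
            if name == "deletemin":
                del inserttime[i], keyvalue[i]
        else:
            return "ill-formed"

    # Second phase: items inserted but never deleted.
    remainders = sorted((inserttime[i], keyvalue[i], i) for i in inserttime)
    j = 0  # bottom-to-top, answer times increase and answer keys decrease
    for t, k, i in remainders:
        while j < len(answerstack) and answerstack[j][2] < t:
            j += 1
        # answerstack[j] is the largest answer given after item i was inserted.
        if j < len(answerstack) and (k, i) < (answerstack[j][1], answerstack[j][0]):
            return "incorrect"
    return "correct"


table14 = [
    ("insert", 6, 300), ("insert", 2, 404), ("insert", 3, 250),
    ("deletemin", 3, 250), ("insert", 10, 248), ("insert", 12, 245),
    ("insert", 4, 260), ("min", 12, 245), ("insert", 13, 140),
    ("insert", 5, 142), ("deletemin", 13, 140), ("deletemin", 5, 142),
    ("deletemin", 12, 245), ("deletemin", 10, 248), ("deletemin", 4, 260),
]
print(validate_priority_queue(table14))
```

Run on the sequence of table 14, the stack evolves exactly as shown in the last column of that table, and the second phase compares the two leftover items (6 and 2) against the single remaining triple (4,260,15).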
8 Answer Validation for Generalized Priority Queue
We next consider the priority-queue abstract data type which allows four
operations: insert, min, deletemin, and delete. An example of a sequence of
such operations appears in table 15.

Time  Operation      Answer     Insert  Stack used in validation
                                time    (sets in stacksets listed beneath)
1     insert(5,310)                     (0,-∞,-1)
                                        {5}
2     insert(6,210)                     (0,-∞,-1)
                                        {5,6}
3     insert(8,280)                     (0,-∞,-1)
                                        {5,6,8}
4     min            (6,210)    2       (6,210,4)
                                        {5,6,8}
5     insert(9,190)                     (6,210,4)
                                        {5,6,8,9}
6     min            (9,190)    5       (9,190,6), (6,210,4)
                                        {5,6,8,9}
7     insert(2,275)                     (9,190,6), (6,210,4)
                                        {2}, {5,6,8,9}
8     delete(8)                 3       (9,190,6), (6,210,4)
                                        {2}, {5,6,9}
9     insert(12,170)                    (9,190,6), (6,210,4)
                                        {2,12}, {5,6,9}
10    insert(14,400)                    (9,190,6), (6,210,4)
                                        {2,12,14}, {5,6,9}
11    deletemin      (12,170)   9       (12,170,11), (9,190,6), (6,210,4)
                                        {2,14}, {5,6,9}
12    insert(3,290)                     (12,170,11), (9,190,6), (6,210,4)
                                        {3}, {2,14}, {5,6,9}
13    insert(7,330)                     (12,170,11), (9,190,6), (6,210,4)
                                        {3,7}, {2,14}, {5,6,9}
14    insert(15,200)                    (12,170,11), (9,190,6), (6,210,4)
                                        {3,7,15}, {2,14}, {5,6,9}
15    delete(9)                 5       (12,170,11), (9,190,6), (6,210,4)
                                        {3,7,15}, {2,14}, {5,6}
16    deletemin      (15,200)   14      (15,200,16), (6,210,4)
                                        {2,3,7,14}, {5,6}
17    delete(7)                 13      (15,200,16), (6,210,4)
                                        {2,3,14}, {5,6}
18    deletemin      (6,210)    2       (6,210,18)
                                        {2,3,5,14}
19    delete(14)                10      (6,210,18)
                                        {2,3,5}
Table 15: Sequence of Priority Queue operations illustrating answer
validation algorithm

The algorithm to solve the validation problem for this data type is an
enhanced version of the algorithm given above for the data type which allowed
only three priority-queue operations.
Algorithm for Answer Validation for Generalized Priority Queue
Input: Sequence of m operations together with arguments and supposed
answers for the priority-queue data type.
Output: "correct", "incorrect" or "ill-formed"
Declarations: All the declarations used in the earlier algorithm are used again.
In addition, a collection of sets called stacksets is used. Each set in stacksets
consists of a set of item numbers (possibly the empty set). There is a
one-to-one correspondence between the sets in stacksets and the ordered triples in
answerstack. Initially, answerstack consists solely of the ordered triple
(0,-∞,-1). Also initially, stacksets contains exactly one set which is the empty
set and which corresponds to (0,-∞,-1).
First phase: In this phase we process each operation as it appears serially
using the following rules:
Let currenttime refer to the time-stamp of the operation being processed.
insert(i,k): Perform the same actions as those given earlier for the insert
operation. In addition, add the item number i to the set in stacksets
corresponding to the top element in answerstack.
min (i,k): (where (i,k) is the supposed answer to the min operation.)
Perform the same actions as those given earlier for the min operation.
In addition, if any elements are popped off of answerstack then the sets in
stacksets corresponding to these elements are unioned together to form a
new set. This new set is placed in correspondence with the new top element
of answerstack.
deletemin (i,k): (where (i,k) is the supposed answer to the deletemin
operation.) Perform the same actions as those given for the min
operation described immediately above. In addition, remove the item number
i from the set in stacksets which contains it. Further, before processing
the next priority queue operation, let inserttime[i]="absent" and let
keyvalue[i]="absent".
delete(i): If inserttime[i]="absent" or keyvalue[i]="absent" then output
"ill-formed" and stop.
Otherwise, let inserttime=inserttime[i] and let k=keyvalue[i]. Next, let
inserttime[i]="absent" and let keyvalue[i]="absent".
Now, let (i',k',answertime') be the ordered triple which corresponds to
the set in stacksets containing item number i. Next, remove item number i
from the set which contains it.
If answertime'>inserttime and (i,k)<(i',k') then output "incorrect" and
stop.
If answertime'>inserttime and (i,k)≥(i',k') then process the next priority
queue operation.
If (i',k',answertime') is the top element of answerstack then process the
next priority queue operation.
Let (i",k",answertime") be the element immediately above (i',k',answertime')
on answerstack.
If (i,k)<(i",k") then output "incorrect" and stop. Otherwise, process the
next priority queue operation.
Second phase: In this phase we operate on the items which have been
inserted but have never been deleted.
For this phase one performs the same operations as the second phase de-
scribed earlier.
Theorem 8.1 The algorithm above for answer validation of the priority
queue abstract data type is correct.
Theorem 8.2 The answer validation algorithm above for priority queue has
a time complexity of O(n) for processing a sequence of O(n) operations.
Proofs omitted for space reasons. It is clear that a priority queue with
operations insert, delete, max, deletemax can also be validated in linear time
by changing the appropriate signs in the algorithm above.
Definition 8.3 Consider a sequence of priority queue operations together
with arguments and supposed answers. The sequence may contain the
following operations: insert, delete, min, deletemin, max, and deletemax.
Based on this sequence we define a new sequence called a minimum sequence.
This sequence differs from the original sequence as follows: Each max op-
eration and answer pair is removed from the sequence. Each deletemax
operation and answer pair is replaced by a delete(i) operation where i is the
item number given in the answer to the deletemax operation. Each other
operation remains the same.
We also define a maximum sequence. This sequence differs from the
original sequence as follows: Each min operation and answer pair is removed
from the sequence. Each deletemin operation and answer pair is replaced
by a delete(i) operation where i is the item number given in the answer to
the deletemin operation. Each other operation remains the same.
Theorem 8.4 Consider a sequence of priority queue operations together
with arguments and supposed answers. The sequence may contain the
following operations: insert, delete, min, deletemin, max, and deletemax. The
answers given for this sequence are correct if and only if the answers given
for the corresponding minimum and maximum sequences are both correct.
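This theorem allows us to define an algorithm which solves the
answer-validation problem for the general priority queue: validate the minimum
sequence and the maximum sequence separately and accept only if both are
correct.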
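
The construction in Definition 8.3 is mechanical, and a minimal Python sketch may help. The tuple layout `(name, item, key)` is an assumption of this example: for insert the pair is the argument, for min, deletemin, max and deletemax it is the supposed answer, and for delete the key is `None`.

```python
def _project(ops, dropped, replaced):
    """Remove every `dropped` operation and turn every `replaced`
    operation into a delete of the item named in its answer."""
    out = []
    for name, item, key in ops:
        if name == dropped:
            continue
        if name == replaced:
            out.append(("delete", item, None))
        else:
            out.append((name, item, key))
    return out


def minimum_sequence(ops):
    # max answers are removed; deletemax becomes delete(i)
    return _project(ops, dropped="max", replaced="deletemax")


def maximum_sequence(ops):
    # min answers are removed; deletemin becomes delete(i)
    return _project(ops, dropped="min", replaced="deletemin")


ops = [
    ("insert", 1, 10), ("insert", 2, 20), ("max", 2, 20),
    ("deletemin", 1, 10), ("deletemax", 2, 20),
]
print(minimum_sequence(ops))
print(maximum_sequence(ops))
```

By Theorem 8.4, the original answers are correct exactly when the minimum sequence passes the validator for insert, delete, min, deletemin and the maximum sequence passes its sign-reversed counterpart.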
9 Probabilistic Model
We will now present a simple probabilistic model with accompanying
analysis which will permit a comparison between our certification-trail method
and the classical time-redundancy approach [32, 52]. The analysis shows
that when the certification-trail method has a smaller execution time than
the time-redundancy approach it yields strictly superior performance. This
means the certification-trail method has both a smaller probability of
error and a smaller probability of undetected error. Surprisingly, the analysis
also reveals the intriguing result that the certification-trail method often can
display superior performance even when the method has the same execution
time or a longer execution time than the time-redundancy approach. This
superior behavior stems from the typical asymmetry of the execution times
of the first and second executions in the certification-trail method.
We make the following assumptions.
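i. Errors are distributed exponentially with parameter $\lambda$.
ii. If errors occur during only one phase of the execution, then they are
detected.
iii. If errors occur in both phases of an execution they are not detected.
For solutions to a problem with run times $a$ and $b$, we therefore have:
\[
\begin{aligned}
\Pr\{\text{correct}\} &= e^{-\lambda(a+b)} \\
\Pr\{\text{detected}\} &= e^{-\lambda a}(1 - e^{-\lambda b}) + e^{-\lambda b}(1 - e^{-\lambda a}) \\
 &= e^{-\lambda a} + e^{-\lambda b} - 2e^{-\lambda(a+b)} \\
\Pr\{\text{undetected}\} &= (1 - e^{-\lambda a})(1 - e^{-\lambda b}) \\
 &= 1 - e^{-\lambda a} - e^{-\lambda b} + e^{-\lambda(a+b)} \\
 &= 1 - \Pr\{\text{correct}\} - \Pr\{\text{detected}\}
\end{aligned}
\]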
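
The three single-run probabilities can be evaluated directly. The short Python sketch below (the function name is an assumption of this example) reproduces the λ = 0.01 row of Table 16, where the basic algorithm is run twice with a = b = 1.

```python
import math


def single_run_probabilities(lam, a, b):
    """Return (Pr{correct}, Pr{detected}, Pr{undetected}) for a
    two-phase solution with phase run times a and b."""
    pa = math.exp(-lam * a)  # probability that the phase of length a is error free
    pb = math.exp(-lam * b)  # probability that the phase of length b is error free
    correct = pa * pb
    detected = pa * (1 - pb) + pb * (1 - pa)  # errors in exactly one phase
    undetected = (1 - pa) * (1 - pb)          # errors in both phases
    return correct, detected, undetected


correct, detected, undetected = single_run_probabilities(0.01, 1.0, 1.0)
print(f"{correct:.6f} {detected:.6f} {undetected:.6f}")
```

The three values always sum to 1, which is the last identity in the derivation above.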
Given two solutions for a problem, we say that the first is strictly superior
to the second iff:
\[
\Pr_1\{\text{correct}\} \ge \Pr_2\{\text{correct}\} \quad\text{and}\quad \Pr_1\{\text{undetected}\} < \Pr_2\{\text{undetected}\}
\]
or
\[
\Pr_1\{\text{correct}\} > \Pr_2\{\text{correct}\} \quad\text{and}\quad \Pr_1\{\text{undetected}\} \le \Pr_2\{\text{undetected}\}
\]
This implies that the run time of the first solution is no greater than
that of the second solution.
Observation 1 Suppose there are two solutions (using certification trails)
to a problem, such that each solution runs in two phases, and the combined
run times of phases is the same for both solutions. Then the solution with
the greater time imbalance between phases is strictly superior.
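Proof: Let $2a$ be the total run time. Let $a + b$ be the run time of the first
phase of the first method, and $a + c$ be the run time of the first phase of
the second method. Then the second phases have times of $a - b$ and $a - c$
respectively. Assume $b < c$.
Since the total run time is the same for both solutions, we have
$\Pr_1\{\text{correct}\} = \Pr_2\{\text{correct}\} = e^{-2\lambda a}$, so we need only show that
$\Pr_1\{\text{detected}\} < \Pr_2\{\text{detected}\}$, i.e.
\[
\begin{aligned}
e^{-\lambda(a+b)}(1 - e^{-\lambda(a-b)}) + e^{-\lambda(a-b)}(1 - e^{-\lambda(a+b)}) &< e^{-\lambda(a+c)}(1 - e^{-\lambda(a-c)}) + e^{-\lambda(a-c)}(1 - e^{-\lambda(a+c)}) \\
e^{-\lambda(a+b)} + e^{-\lambda(a-b)} &< e^{-\lambda(a+c)} + e^{-\lambda(a-c)} \\
e^{-\lambda b} + e^{\lambda b} &< e^{-\lambda c} + e^{\lambda c}
\end{aligned}
\]
Setting $x = e^{\lambda b}$ and $y = e^{\lambda c}$ we want
\[
\begin{aligned}
x + \frac{1}{x} &< y + \frac{1}{y} \\
\frac{1}{x} - \frac{1}{y} &< y - x \\
\frac{y - x}{xy} &< y - x
\end{aligned}
\]
which holds for $1 \le x < y$, because then $xy > 1$.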
Corollary 1 Given a basic algorithm for a problem, a certification trail
method is superior to running the basic algorithm twice if the total run time
is no greater than twice that of the basic algorithm.
The above statements apply to the situation of a single execution of a
solution. A more interesting case is to iterate the solution until no errors are
reported, that is we either arrive at the correct answer, or have undetected
errors.
Let $\Pr_{\text{iter}}\{\text{correct}\}$ denote the probability of finding a correct solution
in the iterated scheme and $\Pr_{\text{iter}}\{\text{undetected}\}$ denote the probability of
accepting an incorrect run.
Note that we repeat a run only when errors are detected, so if we obtain
the correct answer on the $n$-th run, the previous $n - 1$ runs must have
resulted in detected errors. Thus it is clear that:
\[
\Pr_{\text{iter}}\{\text{correct}\} = \Pr\{\text{correct}\} \sum_{i=0}^{\infty} \Pr\{\text{detected}\}^{i} = \frac{\Pr\{\text{correct}\}}{1 - \Pr\{\text{detected}\}}
\]
Similarly,
\[
\Pr_{\text{iter}}\{\text{undetected}\} = \frac{\Pr\{\text{undetected}\}}{1 - \Pr\{\text{detected}\}}
\]
For the iterated scheme, we will say that one method is superior to
another if the probability of obtaining the correct answer is larger. Obviously
if a method is superior in the single run sense, it must be superior in the
iterated case. However it is possible for one method to be superior to another
in the iterated scheme, but not in the single run scheme. This means that
a certification trail method may be better than running a basic algorithm
twice, even if the certification trail takes longer to run!
Suppose we have a basic algorithm A with running time a for a particular
problem, and a certification trail method with phases running in times b and
c. Given b, how small must c be, for the certification trail to be superior?
We require:
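\[
\begin{aligned}
\frac{\Pr_{\text{cert}}\{\text{correct}\}}{1 - \Pr_{\text{cert}}\{\text{detected}\}} &> \frac{\Pr_{\text{basic}}\{\text{correct}\}}{1 - \Pr_{\text{basic}}\{\text{detected}\}} \\
\frac{e^{-\lambda(b+c)}}{1 - e^{-\lambda b} - e^{-\lambda c} + 2e^{-\lambda(b+c)}} &> \frac{e^{-2\lambda a}}{1 - 2e^{-\lambda a} + 2e^{-2\lambda a}} \\
e^{-\lambda(b+c)} - 2e^{-\lambda(a+b+c)} &> e^{-2\lambda a} - e^{-\lambda(2a+b)} - e^{-\lambda(2a+c)} \\
e^{-\lambda c}\left(e^{-\lambda b} + e^{-2\lambda a} - 2e^{-\lambda(a+b)}\right) &> e^{-2\lambda a}(1 - e^{-\lambda b})
\end{aligned}
\]
Note that $b > a$, so $e^{-\lambda b} + e^{-2\lambda a} - 2e^{-\lambda(a+b)}$ must be positive. So,
\[
\begin{aligned}
e^{-\lambda c} &> \frac{e^{-2\lambda a}(1 - e^{-\lambda b})}{e^{-\lambda b} + e^{-2\lambda a} - 2e^{-\lambda(a+b)}} \\
c &< -\frac{1}{\lambda} \ln \frac{e^{-2\lambda a}(1 - e^{-\lambda b})}{e^{-\lambda b} + e^{-2\lambda a} - 2e^{-\lambda(a+b)}}
\end{aligned}
\]
Since the argument to $\ln$ is strictly between 0 and 1, $c$ is well defined for
any choice of $a$, $b$, and $\lambda$.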
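
The bound on c is easy to evaluate numerically. The Python sketch below (function names are assumptions of this example) computes the breakeven check time from the inequality chain above and confirms that, at that value, the iterated probability of correctness equals that of running the basic algorithm twice.

```python
import math


def breakeven_check_time(lam, a, b):
    """Largest second-phase run time c for which a certification-trail
    method with first-phase time b matches the iterated correctness of
    running a basic algorithm of run time a twice (requires b > a)."""
    e2a = math.exp(-2 * lam * a)
    eb = math.exp(-lam * b)
    numerator = e2a * (1 - eb)
    denominator = eb + e2a - 2 * math.exp(-lam * (a + b))
    return -math.log(numerator / denominator) / lam


def iterated_correct(lam, t1, t2):
    """Pr_iter{correct} for a two-phase solution with phase times t1, t2."""
    p1, p2 = math.exp(-lam * t1), math.exp(-lam * t2)
    detected = p1 + p2 - 2 * p1 * p2
    return (p1 * p2) / (1 - detected)


c = breakeven_check_time(1.0, 1.0, 2.0)
print(round(c, 6))
print(iterated_correct(1.0, 2.0, c), iterated_correct(1.0, 1.0, 1.0))
```

With λ = 1.00, a = 1 and b = 2.00 the computed value agrees with the corresponding breakeven entry of Table 17, and the two iterated probabilities printed on the second line coincide.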
In addition to the probability of correctness, we would like to know the
expected running time using the iterated approach. Fortunately, this is
easily determined.
Our probability of stopping on a particular execution is Pr{correct} +
Pr{undetected} = 1 - Pr{detected}. Therefore with that probability we
stop on the first execution, with probability Pr{detected}(1- Pr{detected})
we stop on the second execution, and in general we stop on the nth execution
with probability $(1 - \Pr\{\text{detected}\})\,\Pr\{\text{detected}\}^{n-1}$. This gives us an
expected number of iterations of,
\[
(1 - \Pr\{\text{detected}\}) \sum_{i=0}^{\infty} (i + 1)\,\Pr\{\text{detected}\}^{i}
\]
Now,
\[
\sum_{i=0}^{\infty} (i + 1)x^{i} = \frac{1}{(1 - x)^{2}}
\]
so we find that the expected number of iterations is,
\[
\frac{1}{1 - \Pr\{\text{detected}\}}
\]
Multiplying this by the run time of a single iteration gives the expected
running time.
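
The iterated quantities follow from the single-run probabilities in a few lines. The Python sketch below (names are assumptions of this example, and comparison time is neglected) evaluates them for two runs of a basic algorithm with unit run time and λ = 1.00.

```python
import math


def iterated_statistics(lam, t1, t2):
    """Iterated-scheme statistics for a two-phase solution with phase
    run times t1 and t2, repeated until no error is detected."""
    p1, p2 = math.exp(-lam * t1), math.exp(-lam * t2)
    correct = p1 * p2
    detected = p1 * (1 - p2) + p2 * (1 - p1)
    undetected = (1 - p1) * (1 - p2)
    stop = 1 - detected  # probability that a given execution is accepted
    return {
        "iter_correct": correct / stop,
        "iter_undetected": undetected / stop,
        "expected_iterations": 1 / stop,
        "expected_runtime": (t1 + t2) / stop,
    }


stats = iterated_statistics(1.0, 1.0, 1.0)
print({name: round(value, 6) for name, value in stats.items()})
```

Here the expected run time is about 3.74 time units, compared with 2 units for a single pair of runs, because a detected error occurs in roughly 47% of executions.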
Table 16 shows information for running a basic algorithm. The run time
of a basic algorithm is set to 1 unit of time. The basic algorithm is run
twice and the results compared. We assume that the comparator is fast enough
that the time it takes is negligible (this is justified by the experimental
results), and that it is error free. We compute
i. Prob. Correct - The probability that both phases are error free.
ii. Prob. Detected - The probability that exactly one of the phases contains
an error.
iii. Prob. Undetected - The probability that both of the phases contain
errors.
iv. Iterated Prob. Correct - If the basic algorithm is iterated (each
iteration is two runs), this is the probability that the terminating result is
correct.
v. Expected Runtime - The expected run time of the algorithm in the
iterated model. For the basic algorithm this is twice the expected
number of iterations.
Table 17 illustrates the "breakeven" point for the certification trail ap-
proach. Given a value for λ and a run time b of a trail-generating algorithm,
the breakeven point for the run time of the trail-checking algorithm is the
point at which the iterated probability of correctness is the same as for the
"basic" algorithm (which has a run time of 1).
Run times less than this will result in the certification trail solution being
superior. It is interesting to notice that the total run time of the
certification trail solution at the breakeven point is greater than 2, i.e.,
greater than the time needed to run the basic algorithm twice.

 λ     Basic      Prob.     Prob.     Prob.       Iter. Prob.  Expected
       Algorithm  Correct   Detected  Undetected  Correct      Runtime
0.01   1          0.980199  0.019702  0.000099    0.999899     2.040197
0.10   1          0.818731  0.172213  0.009056    0.989060     2.416081
1.00   1          0.135335  0.465088  0.399576    0.253005     3.738935
Table 16: Balanced Probabilities

 λ     Generate Trail  Breakeven Trail Checker
0.01   1.10            0.909050
0.01   1.50            0.666111
0.01   2.00            0.498750
0.10   1.10            0.908683
0.10   1.50            0.661128
0.10   2.00            0.487505
1.00   1.10            0.905504
1.00   1.50            0.614107
1.00   2.00            0.379885
Table 17: Certification checker breakeven points
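The breakeven run time can also be computed directly. The sketch below is an illustration, assuming that a phase of length t is error free with probability e^(-λt). It equates the ratio Pr{undetected}/Pr{correct} of the certification trail method with that of running the basic algorithm twice, which is equivalent to equating the iterated probabilities of correctness. The function name and the parameter a (run time of the basic algorithm, 1 in the tables) are hypothetical.

```python
import math

def breakeven_check_time(lam, b, a=1.0):
    """Checker run time c at which both methods have equal iterated correctness.

    Assumes a phase of length t is error free with probability exp(-lam * t).
    Then Pr{undetected}/Pr{correct} is (e^(lam*b) - 1)(e^(lam*c) - 1) for the
    certification trail method and (e^(lam*a) - 1)^2 for two basic runs.
    """
    basic_ratio = math.expm1(lam * a) ** 2
    return math.log1p(basic_ratio / math.expm1(lam * b)) / lam

for lam in (0.01, 0.10, 1.00):
    print(lam, [round(breakeven_check_time(lam, b), 6) for b in (1.10, 1.50, 2.00)])
```

Any checker run time below the returned value makes the certification trail method the more reliable choice under this model.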
Table 18 is similar to Table 16, the difference being that it examines
the behavior of certification trail methods for different run times of the two
phases. The meaning of the other columns is identical to the meaning in the
table for basic algorithms. Of interest is the row λ = 1.00, b = 1.50, c = 0.25.
Compare this with the row of Table 16 for λ = 1.00. We see that the certification
trail method has a greater probability of being correct for a single run and the
total run time is shorter than twice the basic algorithm, yet the expected
iterated run time is larger!

 λ     Generate  Use      Prob.     Prob.     Prob.       Iter. Prob.  Expected
       Certif.   Certif.  Correct   Detected  Undetected  Correct      Runtime
0.01   1.10      0.25     0.986591  0.013382  0.000027    0.999972     1.368311
0.01   1.10      0.50     0.984127  0.015818  0.000055    0.999945     1.625716
0.01   1.10      0.75     0.981670  0.018248  0.000082    0.999917     1.884387
0.01   1.50      0.25     0.982652  0.017311  0.000037    0.999962     1.780827
0.01   1.50      0.50     0.980199  0.019727  0.000074    0.999924     2.040248
0.01   1.50      0.75     0.977751  0.022138  0.000111    0.999886     2.300937
0.01   2.00      0.25     0.977751  0.022199  0.000049    0.999949     2.301082
0.01   2.00      0.50     0.975310  0.024591  0.000099    0.999899     2.563028
0.01   2.00      0.75     0.972875  0.026977  0.000148    0.999848     2.826245
0.10   1.10      0.25     0.873716  0.123712  0.002572    0.997065     1.540590
0.10   1.10      0.50     0.852144  0.142776  0.005080    0.994074     1.866490
0.10   1.10      0.75     0.831104  0.161369  0.007527    0.991025     2.205976
0.10   1.50      0.25     0.839457  0.157104  0.003439    0.995920     2.076175
0.10   1.50      0.50     0.818731  0.174476  0.006793    0.991771     2.422703
0.10   1.50      0.75     0.798516  0.191419  0.010065    0.987553     2.782653
0.10   2.00      0.25     0.798516  0.197008  0.004476    0.994426     2.802021
0.10   2.00      0.50     0.778801  0.212359  0.008841    0.988776     3.174033
0.10   2.00      0.75     0.759572  0.227330  0.013098    0.983049     3.559087
1.00   1.10      0.25     0.259240  0.593191  0.147568    0.637254     3.318513
1.00   1.10      0.50     0.201897  0.535609  0.262495    0.434755     3.445370
1.00   1.10      0.75     0.157237  0.490763  0.352000    0.308770     3.632888
1.00   1.50      0.25     0.173774  0.654383  0.171843    0.502793     5.063409
1.00   1.50      0.50     0.135335  0.558990  0.305674    0.306876     4.535047
1.00   1.50      0.75     0.105399  0.484698  0.409903    0.204539     4.366374
1.00   2.00      0.25     0.105399  0.703338  0.191263    0.355283     7.584379
1.00   2.00      0.50     0.082085  0.577696  0.340219    0.194374     5.919905
1.00   2.00      0.75     0.063928  0.479846  0.456226    0.122902     5.286897
Table 18: Unbalanced Probabilities

10 Fault Injection Experiments
A series of hardware fault injection experiments has been conducted in
which combinations of the address, data, and control lines of a Motorola
M68000-based target system were pulsed with selected signals of various
types and durations while the system was executing algorithms. In addition
to the MC68000 microprocessor which served as the CPU, the target
comprised 512K bytes of RAM, 512 bytes of ROM, and numerous I/O
modules to support serial and parallel communication. A timer module is
also included in the target, which uses the 4 MHz clock as a reference so as
to provide execution time data for experiments. Finally, a simple operating
system is resident in the ROM of the target which provides programming
and operational support.
The fault injection testbed on which these experiments were performed is
illustrated in Figure 3. In addition to the target
system, the fault injection testbed contains other modules which perform
the fault injection and data acquisition functions under instruction from
the Operations Control Console. By means of RS232C, SCSI, and GPIB
interfaces, a Macintosh IICX serves as the Operations Control Console per-
mitting fault injections to be precisely executed and resulting error data to
be recorded for later analysis by a SUN SPARCstation 2.
The Operations Control Console also communicates over a VMEbus with
the Testbed Controller which is responsible for overall testbed operation.
The primary component of the Testbed Controller is a MC68030-based unit
with 8 Mbytes of SRAM to store error data from fault injection runs as
communicated to it over the VMEbus from the data acquisition module.
The Testbed Controller also is similarly responsible for the operations of
the fault injection module as determined by commands from the Operations
Control Console.
The fault injection module and the data acquisition module have access
via edge connector pins to the lines of the target system selected for injection
and monitoring, respectively. The fault injections are precisely triggered
after some operator-determined delay following the appearance of an
operator-preselected set of bits on either the address lines of the address
bus or the data lines of the data bus. Similarly, the durations and frequencies of the
injections are also controlled by the operator. The injections emanate from
a bank of programmable function generators included in the fault injection
module. The precision with which fault conditions are triggered and injected
permits the resulting error conditions which are observed to be repeated (if
necessary) for further monitoring/analysis. The data acquisition module is
also triggered by the same address or data bits that activated the fault injec-
tion module. However, there is no delay associated with the data acquisition
function; transfer of the signals on the lines being monitored by the data
acquisition module to the memory of the Testbed Controller commences
immediately upon the data acquisition module's activation. Data monitored by
the data acquisition module is transmitted directly onto the VMEbus and then
written into the SRAM of the Testbed Controller.
10.1 Fault injection and error classification in MC68000 target system
To indicate the general character of the fault injection experiments on the
target system, the injections and resulting errors can be summarized and
displayed at the Operations Control Console as illustrated in Figure 4.
In the example illustrated in Figure 4, the trigger address for the injection
was selected by the operator to be address 1019E (hexadecimal) in the first
version of the Huffman tree program, which generates both the output
and the certification trail. The actual injection consisted of holding the
lower 4 bits of the data bus at logical zero starting 2 microseconds after
the recognition of the trigger address by the fault injection module and
then maintaining the logical zero on these lines for various durations lasting
between 1 and 10 microseconds. For this example, we see that 5 distinct
error conditions resulted depending on the duration of the injection. The
details of data errors classified as type 2 and type 3 are beyond the scope of
this discussion. Suffice it to say that each such type of data error observed
in this particular experimental run could be interpreted as an inconsistent
labeling of nodes in the certification trail passed to the second program. In
each case, however, it should be emphasized that the execution of the second
program utilizing the certification trail detected the error. The other errors
listed in Figure 4 can be categorized as address errors and illegal instructions.
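The injection described above can be pictured with a small software model. This Python sketch is purely illustrative and is not the testbed's control software: it assumes a 16-bit data word, a mask selecting the lower 4 data lines, and an injection window that opens a fixed delay after the trigger. All names and default values are hypothetical.

```python
LOW_NIBBLE = 0x000F  # mask for the lower 4 data lines

def hold_low(word, mask=LOW_NIBBLE):
    """Return the 16-bit word seen while the masked lines are held at zero."""
    return word & ~mask & 0xFFFF

def injection_active(t_us, trigger_us, delay_us=2.0, width_us=5.0):
    """True inside the window [trigger + delay, trigger + delay + width)."""
    start = trigger_us + delay_us
    return start <= t_us < start + width_us

def observed_word(word, t_us, trigger_us, width_us):
    """Bus value at time t_us for an injection of the given width."""
    if injection_active(t_us, trigger_us, width_us=width_us):
        return hold_low(word)
    return word

# A word driven 3 microseconds after the trigger falls inside a 5 microsecond window.
print(hex(observed_word(0xABCD, 3.0, 0.0, 5.0)))
```

Varying `width_us` in such a model mirrors the way the experiment varies the duration of the injection while keeping the trigger and delay fixed.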
Our purpose in presenting Figure 4 is only to illustrate an example of
a fault injection run with a subsequent error analysis and classification. In
general, the errors resulting from injections into the target system could be
classified as:
• No error
• Data output errors
• Certification trail errors
• Addressing errors
• Data value errors
• Halt generated
• Reset generated
• Non-termination of program
• Program mutilation

[Block diagram: Testbed Controller, VMEbus, GPIB Controller, Operations
Control Console (Macintosh), Error Analyzer (SUN), Fault Injection Module,
Function Generator, Data Acquisition Module, Target (68000-based); links
labeled GPIB, SCSI, and RS-232C]
Figure 3: Hardware fault injection testbed for MC68000-based target system

Width (us)  Error
            no error
.2          no error
.3          no error
.4          ADDR TRAP ERROR
.5          ADDR TRAP ERROR
1           ADDR TRAP ERROR
2           ADDR TRAP ERROR
4           ADDR TRAP ERROR
4.5         ADDR TRAP ERROR
5           data_error.2 (Certification Error: Inconsistent Labels)
5.5         data_error.2 (Certification Error: Inconsistent Labels)
6           data_error.3 (Certification Error: Inconsistent Labels)
7           data_error.3 (Certification Error: Inconsistent Labels)
8           data_error.3 (Certification Error: Inconsistent Labels)
9           data_error.3 (Certification Error: Inconsistent Labels)
10          ILLEGAL INSTRUCTION
Figure 4: Example of output displayed at Operations Control Console for
fault injection run for Huffman tree algorithm program
Currently, the testbed tools are being expanded to produce automated
injections using suites of fault conditions on the target system.
Software fault injection experiments were also performed in which in-
structions, data, and stack contents were modified using both the Sun Sparc-
station and the 386 machine with which the previously detailed timing data
was collected. The details of these fault injection experiments will be pre-
sented in a companion document.
11 Concluding Discussion
This paper experimentally supplements two previous FTCS papers [1, ?]
which theoretically explore the new fault tolerance technique referred to as
the certification trail method. We have presented experimental timing data
which illustrates the advantages of the certification trail technique over clas-
sical time redundancy. We have also presented analytical results which
further support the significance of the certification trail technique.
References
[1] Sullivan, G.F., and Masson, G.M., "Using certification trails to achieve
software fault tolerance," Digest of the 1990 Fault Tolerant Computing
Symposium, pp. 423-431, IEEE Computer Society Press, 1990.
[2] Sullivan, G.F., and Masson, G.M., "Using certification trails to achieve
software fault tolerance," Department of Computer Science Technical
Report JHU 89/26, Johns Hopkins University, Baltimore, Maryland,
1989.
[3] Sullivan, G.F., and Masson, G.M., "Certification trails for data struc-
tures," Digest of the 1991 Fault Tolerant Computing Symposium, pp.
240-247, IEEE Computer Society Press, 1991.
[4] Sullivan, G.F., and Masson, G.M., "Certification trails for data struc-
tures," Department of Computer Science Technical Report JHU 90/17,
Johns Hopkins University, Baltimore, Maryland, 1990.
[5] Adel'son-Vel'skii, G. M., and Landis, E. M., "An algorithm for the or-
ganization of information", Soviet Math. Dokl., pp. 1259-1262, 3, 1962.
[6] Anderson, T., and Lee, P., Fault tolerance: principles and practices,
Prentice-Hall, Englewood Cliffs, NJ, 1981.
[7] Andrews, D., "Software fault tolerance through executable assertions,"
Rec. 12th Asilomar Conf. Circuits, Syst., Comput., pp. 641-645, 1978,
Nov. 6-8.
[8] Andrews, D., "Using executable assertions for testing and fault toler-
ance," Dig. 9th Annu. Int. Symp. Fault Tolerant Comput., pp. 102-105,
1979, June 20-22.
[9] Avizienis, A., "Fault tolerance by means of external monitoring of com-
puter systems," Proceedings of the 1981 National Computer Conference,
pp. 27-40, AFIPS Press, 1980.
[10] Avizienis, A., "Design diversity - the challenge of the eighties," Digest
of the 1982 Fault Tolerant Computing Symposium, pp. 44-45, IEEE
Computer Society Press, 1982.
[11] Avizienis, A., and Kelly, J., "Fault tolerance by design diversity: con-
cepts and experiments," Computer, vol. 17, pp. 67-80, Aug., 1984.
[12] Avizienis, A., "The N-version approach to fault tolerant software,"
IEEE Trans. on Software Engineering, vol. 11, pp. 1491-1501, Dec.,
1985.
[13] Bayer, R., and McCreight, E., "Organization of large ordered indexes",
Acta Inform., pp. 173-189, 1, 1972.
[14] Blough, D., and Masson, G., "Performance analysis of a generalized
concurrent error detection procedure," IEEE Trans. on Computers, vol.
39, Jan., 1990.
[15] Blum, M., and Kannan, S., "Designing programs that check their
work", Proceedings of the 1989 ACM Symposium on Theory of Com-
puting, pp. 86-97, ACM Press, 1989.
[16] Chen, L., and Avizienis A., "N-version programming: a fault toler-
ant approach to reliability of software operation," Digest of the 1978
Fault Tolerant Computing Symposium, pp. 3-9, IEEE Computer Society
Press, 1978.
[17] Cormen, T. H., and Leiserson, C. E., and Rivest, R. L., Introduction to
Algorithms McGraw-Hill, New York, NY, 1990.
[18] Dijkstra, E. W., "A note on two problems in connexion with graphs,"
Numer. Math. 1, pp. 269-271, Sept., 1959.
[19] Eifert, J.B., and Shen, J.P., "Processor monitoring using asynchronous
signatured instruction streams," Dig. 14th Int. Conf. Fault-Tolerant
Comput., pp. 394-399, 1984, June 20-22.
[20] Fredman, M. L., and Willard, D. E., "Trans-dichotomous algorithms for
minimum spanning trees and shortest paths," Proc. 31st IEEE Foun-
dations of Computer Science, pp. 719-725,1990.
[21] Fredman, M. L., and Saks, M. E., "The cell probe complexity of dy-
namic data structures," Proc. 21st ACM Symp. on Theo. Comp. 1989,
pp. 109-122, 2, 1986.
[22] Gabow, H. N., Galil, Z., Spencer, T., and Tarjan, R. E., "Efficient algo-
rithms for finding minimum spanning trees in undirected and directed
graphs," Combinatorica 6, pp. 109-122, 2, 1986.
[23] Gabow, H. N., and Tarjan, R. E., "A linear-time algorithm for a special
case of disjoint set union," J. of Comp. and Sys. Sci., 30(2), pp. 209-
221, 1985.
[24] Graham, R. L., "An efficient algorithm for determining the convex hull
of a planar set", Information Processing Letters, pp. 132-133, 1, 1972.
[25] Graham, R. L., and Hell, P., "On the history of the minimum spanning
tree problem," Ann. Hist. Comput., pp. 43-47, Jan., 1985.
[26] Hoare, C. A. R., "Quicksort," Computer Journal, pp. 10-15, 5(1), 1962.
[27] Guibas, L. J., and Sedgewick, R., "A dichromatic framework for bal-
anced trees", Proceedings of the Nineteenth Annual Symposium on
Foundations of Computing, pp. 8-21, IEEE Computer Society Press,
1978.
[28] Gunneflo, U., Karlsson, J., and Torin, J., "Evaluation of error detection
schemes for using fault injection by heavy-ion radiation," Dig. of the
1989 Fault Tolerant Computing Symposium, pp. 340-347, June, 1989.
[29] Huang, K.-H., and Abraham, J., "Algorithm-based fault tolerance for
matrix operations," IEEE Trans. on Computers, pp. 518-529, vol. C-33,
June, 1984.
[30] Huffman, D., "A method for the construction of minimum redundancy
codes", Proc. IRE, pp. 1098-1101, 40, 1952.
[31] Iyengar, V.S. and Kinney, L.L., "Concurrent fault detection in micro-
programmed control units," IEEE Trans. Comput., vol. C-34, pp. 810-
821, Sept. 1985.
[32] Johnson, B., Design and analysis of fault tolerant digital systems,
Addison-Wesley, Reading, MA, 1989.
[33] "Fault tolerant FFT networks," Dig. of the 1985 Fault Tolerant Com-
puting Symposium, June, 1985.
[34] Kane, J.R. and Yau, S.S., "Concurrent software fault detection," IEEE
Trans. Software Eng., vol. SE-1, pp. 87-99, March 1975.
[35] Komlós, J., "Linear verification for spanning trees", Proceedings of the
1984 Symposium on Foundations of Computing, pp. 201-206, IEEE
Computer Society Press, 1984.
[36] Lee, Y.H. and Shin, K.G., "Design and evaluation of a fault-tolerant
multiprocessor using hardware recovery blocks," IEEE Trans. Comput.,
vol. C-33, pp. 113-124, Feb. 1984.
[37] Lu, D., "Watchdog processor and structural integrity checking," IEEE
Trans. Comput., vol. C-31, pp. 681-685, July 1982.
[38] Mahmood, A., Lu, D.J. and McCluskey, E.J., "Concurrent fault detec-
tion using a watchdog processor and assertions," Proc. 1983 Int. Test
Conf., pp. 622-628, Oct., 1983.
[39] Mahmood, A., Ersoz, A. and McCluskey, E.J., "Concurrent system level
error detection using a watchdog processor," Proc. 1985 Int. Test Conf.,
pp. 145-152, Nov., 1985.
[40] Mahmood, A., and McCluskey, E., "Concurrent error detection using
watchdog processors - a survey," IEEE Trans. on Computers, vol. 37,
pp. 160-174, Feb., 1988.
[41] Mahmood, A., and McCluskey, E., "Concurrent error detection using
watchdog processors", IEEE Trans. on Computers, vol. 37, pp. 160-174,
Feb., 1988.
[42] Nair, V., and Abraham, J., "General linear codes for fault-tolerant
matrix operations on processor arrays," Dig. of the 1988 Fault Tolerant
Computing Symposium, pp. 180-185, June, 1988.
[43] Namjoo, M., and McCluskey, E., "Watchdog processors and capability
checking," Digest of the 1982 Fault Tolerant Computing Symposium,
pp. 245-248, IEEE Computer Society Press, 1982.
[44] Namjoo, M., "Techniques for concurrent testing of VLSI processor op-
eration," Dig. 1982 Int. Test Conf., pp. 461-468, Nov., 1982.
[45] Namjoo, M., "CERBERUS-16: An architecture for a general purpose
watchdog processor," Dig. Papers 13th Annu. Int. Symp. Fault Tolerant
Comput., pp. 216-219, June, 1983.
[46] Preparata F. P., and Shamos M. I., Computational geometry: an intro-
duction, Springer-Verlag, New York, NY, 1985.
[47] Prim, R. C., "Shortest connection networks and some generalizations,"
Bell Syst. Tech. J., pp. 1389-1401, Nov., 1957.
[48] Randell, B., "System structure for software fault tolerance," IEEE
Trans. on Software Engineering, vol. 1, pp. 220-232, June, 1975.
[49] Schmid, M., Trapp, R., Davidoff, A., and Masson, G., "Upset exposure
by means of abstraction verification," Dig. of the 1982 Fault Tolerant
Computing Symposium, pp. 237-244, June, 1982.
[50] Sedgewick, R., "Implementing quicksort programs," Communications
of the ACM, pp. 847-857, 21(10), 1978.
[51] Shen, J.P. and Schuette, M.A., "On-line self-monitoring using signa-
tured instruction streams," Proc. 1983 Int. Test Conf., pp. 275-282,
Oct., 1983.
[52] Siewiorek, D., and Swarz, R., The theory and practice of reliable design,
Digital Press, Bedford, MA, 1982.
[53] Sridhar, T. and Thatte, S.M., "Concurrent checking of program flow in
VLSI processors," Dig. 1982 Int. Test Conf., pp. 191-199, Nov., 1982.
[54] Tarjan, R. E., Data Structures and Network Algorithms, Society for
Industrial and Applied Mathematics, Philadelphia, PA, 1983.
[55] Tarjan, R. E., "Efficiency of a good but not linear set union algorithm,"
J. ACM, 22(2), pp. 215-225, 1975.
[56] Tarjan, R. E., "A class of algorithms which require nonlinear time to
maintain disjoint sets," J. of Comp. and Sys. Sci., 18(2), pp. 110-127,
1979.
[57] Tarjan, R. E., and Leeuwen, J. van, "Worst-case analysis of set union
algorithms," J. ACM, 31(2), pp. 245-281, 1984.
[58] Tarjan, R. E., "Applications of path compression on balanced trees",
J. ACM, pp. 690-715, Oct., 1979.
[59] Tomas, S. P. and Shen, J. P., "A roving monitoring processor for detec-
tion of control flow errors in multiple processor systems," Proc. IEEE
Int. Conf. Comput. Design: VLSI Comput., pp. 531-539, Oct., 1985.
[60] Taylor, D., "Error Models for robust data structures," Dig. 20th Annu.
Int. Symp. Fault Tolerant Comput., pp. 416-422, 1990, June 26-28.
[61] Williams, J. W. J., "Algorithm 232 (heapsort)," Commun. of ACM,
vol. 7, pp. 347-348, 1964.
[62] Yau, S.S., and Chen, F.-C., "An approach to concurrent control flow
checking," IEEE Trans. Software Eng., vol. SE-6, pp. 126-137, March
1980.
APPENDIX A
DATA ACQUISITION MODULE
TECHNICAL MANUAL
Ver. 1.0

THE TABLE OF CONTENTS
1. The Experimental System Overview
1.1 System Configuration
1.2 General System Description
1.3 System Customization
2. Data Acquisition Module
2.1 Hardware Overview
2.2 Clock Control
2.3 Address Generator
2.4 Address Bus Buffers and Address Modifier Selector
2.5 Data Transfer Control
2.6 Input Channel Selector and Data Bus Buffers
2.7 VMEbus Master Control
3. Interface Signals
3.1 VMEbus Interface
3.2 Input Channels
Appendix A Schematic Diagrams
Appendix B Parts List
Appendix C DAM Board Layout
Appendix D Copies of Data Sheets
1. The Experimental System Overview
This system provides an experimental environment for recording and ana-
lyzing upset data in computer systems. This chapter provides information
on the system configuration and a general hardware description.
1.1 System Configuration
This experimental system is mainly based on the VMEbus and controlled
by the 68030 CPU board. The VMEbus provides a master-slave, asyn-
chronous, non-multiplexed data transfer medium. The target system (CPU
Under Test) and the Fault Injection Module are connected by its local bus.
Fig. 1.1 shows the experimental configuration. This system's features in-
clude:
• 68030 CPU Board
• Up to 8 Mbyte SRAM Memory Modules
• Floppy Disk and SCSI Bus Controller (FDC/SCSI)
• 80 Mbyte Hard Disk and 3.5" Floppy Disk Drive
• OS-9 Operating System
• Chassis with power supply, cooling fans, and motherboard
• Data Acquisition Module
• CPU Under Test (MC68000 Educational Computer Board)
• Fault Injection Module
• (GP-IB I/F Controller)
• (SUN SPARCstation)
Fig. 1.1 Experimental Configuration
DAM: Data Acquisition Module
CUT: CPU Under Test (Target System)
FIM: Fault Injection Module
1.2 General System Description
This section briefly describes each module of the
experimental system. For detailed information, refer to the user's manuals
on specific modules.
• 68030 CPU Board
- SYS68K/CPU-33XN (Force Computers Inc.)
- 68030 CPU with 16.7 MHz clock frequency.
- Not equipped with the Floating Point Coprocessor.
- 32-bit high speed DMA controller for data transfers.
- 1 Mbyte of shared dynamic RAM.
- Two multiprotocol serial I/O channels.
- Up to 2 Mbyte EPROM and up to 512 Kbyte SRAM/EEPROM.
- Real Time Clock with calendar and on-board battery backup.
- Full 32 bit VMEbus master/slave interface.
• Memory Module
- SYS68K/SRAM-6 (Force Computers Inc.)
- 2 Mbyte SRAM on SRAM-6.
- Battery backup for SRAM devices.
- 55ns(typical) Read/Write Access Time.
- Jumper selectable access address and address modifier code.
- VMEbus interface supporting 32 data and 32 address lines.
• Floppy Disk and SCSI Bus Controller
- SYS68K/ISCSI-1 (Force Computers Inc.)
- 68010 CPU for local control.
- 68450 DMA Controller for local transfers.
- SCSI bus interface with the NCR5386S SCSI bus controller.
- SHUGART compatible floppy interface with the WD1772 FDC.
- All I/O signals available on P2 connector.
- VMEbus interface supporting A24:D16, D8.
• Mass Storage Module
- SYS68K/MSM-84 (Force Computers Inc.)
- Only VME P1 backplane is required.
- 64 Pin flat cable is used to connect P2 of the ISCSI-1.
- Floppy Disk Drive (Toshiba ND352)
* Disk Size and Capacity: 3.5", 1.0 Mbyte
* Number of Tracks: 160
* Access Time: 79 ms (average)
- Hard Disk (Quantum PRO80S)
* Disk Size and Capacity: 3.5", 84 Mbyte
* Number of Cylinders and Heads: 834, 6
* Seek Time: 19 ms (average)
• OS-9 Operating System
- Professional OS-9 (Microware Systems Corporation)
- Multitasking, real time operating system.
- UNIX-like shell and a hierarchical directory/file structure.
- C Compiler, Assembler/Linker, and User-state Debugger.
- μMACS screen-oriented text editor.
• Chassis with power supply, cooling fans, and motherboard
- SYS68K/TARGET-32 (Force Computers Inc.)
- 19", 7U chassis.
- 500 W power supply to drive VMEbus and mass storage memory.
- Cooling systems with four fans.
- 20 slot J1-J2 VMEbus Motherboard.
• Data Acquisition Module
- Up to 8 Mbyte address space.
- Jumper selectable address modifier code.
- 32 Input Channels with data selectors.
- VMEbus compatible data transfers supporting A24:D32, D8.
- VMEbus Master bus control (Non-slot 1)
• CPU Under Test
- MC68000 Educational Computer Board (Motorola Inc.)
- 4 MHz MC68000 16-bit CPU.
- 32 Kbyte of DRAM and 16 Kbyte firmware ROM/EPROM mon-
itor.
- Two serial ports provided for a terminal and a host.
• Fault Injection Module
- Hardware fault injections on IC pin lines.
- Single/multiple faults of stuck/bridging types with fault duration
varying from 250 ns to _. &q'_,S •
- Application program generated fault injection.
1.3 System Customization
This section describes the system customization required to implement
the upset analysis experimental system. This also provides information on
the programming of peripherals.
• SYS68K/CPU-33XN
- OS-9/68000¹ EPROM Installation
* Remove VMEPROM² and install EPROMs for OS-9.
* High -- Socket J6, Low -- Socket J4
- EPROM Type Selection
* 27512 EPROM
* Jumperfield B1: 1 to 12, 6 to 7
- Interfacing PI/T2 User I/O Port
* Device: MC68230 Parallel Interface/Timer (PI/T)
* Accessible via the 8-bit local I/O bus. Table 1.1 shows the
register layout of PI/T2.
* User I/O port is available on P2 of VMEbus, shown in Table
1.2.
- The Address Map
* The address map of this CPU board is listed in Table 1.3.
* A24: D32, D24, D16, D8 area: SRAM-6, ISCSI-1
• SYS68K/SRAM-6
- Address Modifier Selection
* Standard Supervisor/Non-privileged Data Access
* Address Modifier Code: 3D, 39
* Jumperfield B4: 4 to 15, 2 to 17
- VMEbus Interface
* A24: D32, D16, D8
* Standard Address Mode (A24)
* Address: $XX000000 - $XX200000 (2 Mbyte)
* Jumperfield B3: 18 to 15, 20 - 30 to 13 - 3
• SYS68K/ISCSI-1
- Address Modifier Selection
* Standard Non-privileged/Supervisory program and data ac-
cess.
* Address Modifier Code: 3A, 39, 3E, 3D
* Jumperfield B22: 5 to 2, 6 to 1
- VMEbus Interface
* A24: D16, D8
* Address: $XXA00000 - $XXA1FFFF (128 Kbyte)
* Jumperfield B21: 2 to 17, 4 - 7 to 15 - 12
Table 1.1 PI/T2 Register Layout
ADDRESS   REGISTER    DESCRIPTION
FF800E00  PIT2 PGCR   Port General Control Register
FF800E01  PIT2 PSRR   Port Service Request Register
FF800E02  PIT2 PADDR  Port A Data Direction Register
FF800E06  PIT2 PACR   Port A Control Register
FF800E08  PIT2 PADR   Port A Data Register
FF800E0A  PIT2 PAAR   Port A Alternate Register
FF800E0D  PIT2 PSR    Port Status Register
Table 1.2 PI/T2 User I/O Interface Signals
PIN No.
4
5
6
7
8
9
10
11
13
14
15
16
PORT No.
PA0
PA1
PA2
PA3
PA4
PA5
PA6
PA7
H1
H2
H3
H4
IN/OUT
OUT
OUT
OUT
OUT
IN
IN
P2/J2 No.
A29
C29
A30
C30
A31
C31
A32
C32
A27
C27
A28
C28
SIGNAL
READY*
LW/B*
SLCT0*
SLCT1*
ENB0*
ENB1*
Table 1.3 The Address Map
START (HEX)  END (HEX)  SPACE    DESCRIPTION
00000000     003FFFFF   1.0 MB   Shared Memory
00400000     F9FFFFFF   3.9 GB   A32: D32, D24, D16, D8
FA000000     FAFFFFFF   16.0 MB  Message Broadcast Area
FB000000     FBFEFFFF   15.9 MB  A24: D32, D24, D16, D8
FBFF0000     FBFFFFFF   64.0 KB  A16: D32, D24, D16, D8
FC000000     FCFEFFFF   15.9 MB  A24: D16, D8
FCFF0000     FCFFFFFF   64.0 KB  A16: D16, D8
FD000000     FFFFFFFF            System Area
¹ OS-9 and OS-9/68000 are trademarks of Microware Systems Corporation.
² VMEPROM is a PDOS based real time monitor.
2. Data Acquisition Module
When the fault is injected from the fault injection module, the data ac-
quisition module is activated and activity data on 8 or 32 observation points
are synchronously sampled with the clock of the target system and written
into the SRAM memory module.
2.1 Hardware Overview
Basically, the data acquisition module generates the address signals from
the clock of the target system and transfers the sampled data to the memory
module via the VMEbus.
A block diagram is shown in Fig.2.1. This board consists of the following
functional blocks:
• Clock Control (CKCTRL)
• Address Generator (ADDGEN)
• Address Modifier Selector (AMS)
• Address Bus Buffers (ABUF)
• Data Transfer Control (DTCTRL)
• Input Channel Selectors (INSLCT)
• Data Bus Buffers (DBUF)
• Bus Master Control (BUSMST)
2.2 Clock Control
• Recording Clock Selector
- J1-1, IC1-1
- Selectable by bit 1 and 2 of J1.
* Clock of CPU Under Test: bit 1: ON, bit 2: OFF
* 16 MHz VME System Clock: bit 1: OFF, bit 2: ON
• Clock Frequency Divider
- J1-2, IC2
- Selectable by bit 3 - 7 of J1 as shown in Table 2.1.
• Qualifier Trigger
- IC1-2, IC3-1, IC10-1
- Trigger: Fault injection signal transferred from FIM.
- The trigger is enabled when ENB1 is high.
• Clear Control
- R1, IC1-3, IC16-1
- Generate Clear Signal for the Clock Control, Address Generator,
and Data Transfer Control.
- Reset Signals: System Reset, Bus Error, and End Address.
• End Address Selection
- J2-1
- End address: $XX0FFFFF - $XX7FFFFF
- Selectable by bit 1 - 4 of J2-1 as shown in Table 2.2.
Table 2.1 Frequency Division Settings
Division  bit 3  bit 4  bit 5  bit 6  bit 7
1         ON     OFF    OFF    OFF    OFF
2         OFF    ON     OFF    OFF    OFF
4         OFF    OFF    ON     OFF    OFF
8         OFF    OFF    OFF    ON     OFF
16        OFF    OFF    OFF    OFF    ON

Table 2.2 End Address Selection
End Address  bit 1  bit 2  bit 3  bit 4
$XX0FFFFF    ON     OFF    OFF    OFF
$XX1FFFFF    OFF    ON     OFF    OFF
$XX3FFFFF    OFF    OFF    ON     OFF
$XX7FFFFF    OFF    OFF    OFF    ON
2.3 Address Generator
• Address Signal Generator
- IC4, IC5, IC6, IC7, IC8, IC9
- Implement 24-bit synchronous binary counter using a carry-look-
ahead circuit.
- Maximum clock frequency is calculated as follows:
$$f_{MAX} = \frac{1}{t_{pLH}(\text{CLK to RCO}) + t_{su}(\text{ENT})}$$
- Address Space
* Up to 8 Mbyte Address Space. Refer to Table 2.3.
* Start address: $XX000000 (fixed)
* End address: $XX0FFFFF - $XX7FFFFF (selectable)
• Counter Status Output
- IC10-2
- When counters are enabled to count, ENB1* is asserted.
Table 2.3 Address Space and End Address
Address Space  End Address
1 Mbyte        $XX0FFFFF
2 Mbyte        $XX1FFFFF
4 Mbyte        $XX3FFFFF
8 Mbyte        $XX7FFFFF
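The relation between the selected address space and the end address in Table 2.3 is simply end = size - 1 within the 24-bit address field, since the start address is fixed at offset 000000. A small Python sketch of this arithmetic follows; the helper name is hypothetical and the upper address byte (shown as XX) is left out.

```python
MBYTE = 1 << 20

def end_address(space_mbytes):
    """Low 24 bits of the end address for a 1, 2, 4, or 8 Mbyte address space."""
    if space_mbytes not in (1, 2, 4, 8):
        raise ValueError("address space must be 1, 2, 4, or 8 Mbyte")
    return space_mbytes * MBYTE - 1  # start address is fixed at offset 000000

for size in (1, 2, 4, 8):
    print(f"{size} Mbyte -> $XX{end_address(size):06X}")
```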
2.4 Address Bus Buffers and Address Modifier Selector
• Address Bus Buffers
- IC12, IC13, IC14
- Three transparent D-latches (74AS573) interface local address sig-
nals with the VMEbus address bus.
- DHBA* places the 24-bit outputs in either a normal logic state or
a high-impedance state.
• Address Modifier Selector
- J2-2, RN, IC11
- 6-bit Codes: Used for an additional decoding parallel to the ad-
dress signals.
- Address Mode: Supports the standard address mode (A24) for
supervisor or non-privileged memory access.
* 3E: Standard Supervisor Program Access
* 3D: Standard Supervisor Data Access
* 3A: Standard Non-Privileged Program Access
* 39: Standard Non-Privileged Data Access
- Selectable by bit 5 - 10 of J2 as shown in Table 2.4.
Table 2.4 Address Modifier Codes and Settings
HEX  Binary  bit 5  bit 6  bit 7  bit 8  bit 9  bit 10
3E   111110  OFF    OFF    OFF    OFF    OFF    ON
3D   111101  OFF    OFF    OFF    OFF    ON     OFF
3A   111010  OFF    OFF    OFF    ON     OFF    ON
39   111001  OFF    OFF    OFF    ON     ON     OFF
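Table 2.4 suggests that jumper bits 5 - 10 carry the six address modifier bits from most to least significant, with ON selecting a 0 and OFF selecting a 1. The following Python sketch encodes that reading of the table; the polarity and bit order are inferred from the four listed rows rather than stated in the text, and the function name is hypothetical.

```python
def am_code(jumpers):
    """Address modifier code from the J2 bit 5..10 settings ('ON' or 'OFF').

    Assumption inferred from Table 2.4: bit 5 is the most significant
    code bit, and an ON jumper selects 0 while an OFF jumper selects 1.
    """
    if len(jumpers) != 6:
        raise ValueError("expected six settings, for J2 bits 5 to 10")
    code = 0
    for setting in jumpers:
        code = (code << 1) | (0 if setting == "ON" else 1)
    return code

settings = ["OFF", "OFF", "OFF", "ON", "ON", "OFF"]
print(f"{am_code(settings):02X}")
```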
2.5 Data Transfer Control
• Data Transfer Bus Control
- ENB1, DWB*
* IC10-3, IC15-1
* When READY* is asserted, both ENB1 and DWB* are latched
active.
* LCLR* resets the outputs.
- LAS*
* R2, IC10-4, IC15-2, IC17-1, -2
* When READY* is asserted, LAS* is set active.
* During data transfers, LAS* is asserted by LCLK and reset by
LDTACK*.
- LA01, LDS0-1*, LLWORD*
* IC16-2, -3, -4, IC18-1, -2, IC30-1, -2, -3, IC33-1
* When LW/B* is high (long word mode), LDS0*, LDS1*, LA01,
and LLWORD* are set low during data transfers.
* When LW/B* is low (byte mode), LLWORD* is set high
and the other signals respond as follows:
LDS0* = QA00, LDS1* = NOT QA00, LA01 = QA01
• Data Bus Buffer Control
- IC17-3, -4, IC18-4, -5
- Long Word Mode (LW/B* is high)
* While DHBD* is active, ENBL* is asserted and ENBB* is
de-asserted.
- Byte Mode (LW/B* is low)
* While DHBD* is active, ENBB* is asserted and ENBL* is
de-asserted.
• Bus Release Control
- IC31-1
- Supports Release On Request (ROR) operation.
- Bus request signals (BR0-3*) assert BREL to release
BBSY* at the end of the current data transfer.
2.6 Input Channel Selector and Data Bus Buffers
• Input Channel Selector
- IC10-5, -6, IC19, IC20, IC21, IC22
- Implement 32-to-8 data selectors using four 4-bit data selectors.
- Data selection is controlled by the two select inputs (SLCT0-1*)
as shown in Table 2.5.
• Data Bus Buffers
- Long Word Mode
* IC23, IC24, IC25, IC26
* Four transparent D-latches (74AS573) interface 32-bit input
data with the 32-bit VME data bus (D00-31).
* When LAS* is taken low, the outputs are latched to retain
the data that was set up. Refer to Table 2.6.
* ENBL* places the 32-bit outputs in either a normal logic state
or a high-impedance state.
- Byte Mode
* IC27, IC32
* Two transparent D-latches (74AS573) interface 8-bit local
data bus (LD0-7) with the 16-bit VME data bus (D00-15).
* When LAS* is taken low, the outputs are latched to retain
the data that was set up. Refer to Table 2.6.
* ENBB* places the 16-bit outputs in either a normal logic state
or a high-impedance state.
Table 2.5 Input Channel Selection
SLCT0*  SLCT1*  LD7  LD6  LD5  LD4  LD3  LD2  LD1  LD0
high    high    28   24   20   16   12   08   04   00
high    low     29   25   21   17   13   09   05   01
low     high    30   26   22   18   14   10   06   02
low     low     31   27   23   19   15   11   07   03
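Table 2.5 follows a regular pattern: local data line LDk carries input channel 4k + n, where n runs from 0 to 3 as the select inputs step through the four rows. A minimal sketch of that mapping, with a hypothetical function name and with pin levels passed as booleans (True = high), might look like this:

```python
# Illustrative sketch of the 32-to-8 input channel selection of Table 2.5.
# Arguments are the logic levels on the SLCT0* and SLCT1* pins
# (True = high, False = low); the function name is hypothetical.
def selected_channels(slct0_high: bool, slct1_high: bool) -> list:
    """Return the DATA channel numbers routed to LD7..LD0, in that order."""
    offset = 2 * int(not slct0_high) + int(not slct1_high)
    return [4 * k + offset for k in range(7, -1, -1)]


if __name__ == "__main__":
    for s0 in (True, False):
        for s1 in (True, False):
            levels = ("high" if s0 else "low", "high" if s1 else "low")
            print(levels, selected_channels(s0, s1))
```

Reading all 32 channels in byte mode therefore takes four byte transfers, one per select combination.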
Table 2.6 (a) Active Portions of Data Bus
DS1*  DS0*  A01   LWORD*  D24-31  D16-23  D08-15  D00-07
low   low   low   low     byte 0  byte 1  byte 2  byte 3
high  low   high  high                            byte 3
low   high  high  high                    byte 2
high  low   low   high                            byte 1
low   high  low   high                    byte 0
Table 2.6 (b) Data Organization in Memory
Operand  Byte Address
byte 0   $XXX....XX00
byte 1   $XXX....XX01
byte 2   $XXX....XX10
byte 3   $XXX....XX11
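Tables 2.6 (a) and (b) can be cross-checked with a small lookup. The sketch below is an illustration under stated assumptions: the dictionary layout and function names are hypothetical, and the single-byte rule (even bytes on D08-15 with DS1* low, odd bytes on D00-07 with DS0* low, A01 equal to bit 1 of the byte address) is read off the two tables.

```python
# Illustrative sketch: which data lanes are active for a given strobe
# combination (Table 2.6 (a)). Keys are (DS1*, DS0*, A01, LWORD*).
ACTIVE_LANES = {
    ("low", "low", "low", "low"): {"D24-31": 0, "D16-23": 1, "D08-15": 2, "D00-07": 3},
    ("high", "low", "high", "high"): {"D00-07": 3},
    ("low", "high", "high", "high"): {"D08-15": 2},
    ("high", "low", "low", "high"): {"D00-07": 1},
    ("low", "high", "low", "high"): {"D08-15": 0},
}


def byte_cycle(byte: int) -> tuple:
    """Strobe levels for a single-byte transfer of byte 0..3 (Table 2.6 (b))."""
    if byte not in (0, 1, 2, 3):
        raise ValueError("byte must be 0..3")
    even = byte % 2 == 0
    ds1 = "low" if even else "high"
    ds0 = "high" if even else "low"
    a01 = "high" if byte & 2 else "low"
    return (ds1, ds0, a01, "high")


if __name__ == "__main__":
    for b in range(4):
        print(b, byte_cycle(b), ACTIVE_LANES[byte_cycle(b)])
```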
2.7 VMEbus Master Control
• Master Bus Controller
- IC28, IC29
- VME 1220† provides a two-device chip set for a non-slot 1 master bus
controller.
- Initiating a Bus Request
* Drive BR0* low after receiving DWB* and LAS* asserted.
- Arbitration
* After receiving BG0IN* from the daisy-chained VMEbus grants, the
local arbiter arbitrates between DWB* and BG0IN*.
. If DWB* wins the arbitration (i.e. DWB* occurs before
BG0IN*), BBSY* will be asserted.
. If BG0IN* wins, the local arbiter will drive BG0OUT*, which
passes the bus grant down the daisy chain to the adjacent
master in the system.
- Data Transfer
* The local master does not access the bus until the previous
master has relinquished control of the bus, which occurs when AS*,
DTACK* and BERR* are de-asserted.
* Supports Address Pipelining using DHBA* and DHBD*.
. Broadcast the address of the next bus cycle while the data
transfer of the current cycle is occurring, i.e. DTACK* and
DSn* are still low.
. DHBA* is enabled as soon as AS* is disabled.
. When DTACK* goes high, signifying the end of the current
data cycle, DHBD* enables the data buffers for the next
data cycle.
* WRITE* is latched during address pipelining to hold its level.
- Bus Release
* Supports Release On Request (ROR) protocol via BREL.
. Release the data transfer bus whenever another module
requires it.
. An external bus request will assert BREL to release BBSY*
at the end of the current data transfer. Refer to section
2.5.
. If no bus requests are pending, BREL will be kept
de-asserted and the local master maintains BBSY* low to
perform continuous VMEbus data transfer cycles.
†PLX Technology, 625 Clyde Ave., Mountain View, CA 94043
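The local arbitration rule described in section 2.7 (whichever of DWB* and BG0IN* arrives first wins) can be written as a small decision function. This is a behavioural sketch only: the function name, the use of assertion times in nanoseconds, and the returned strings are assumptions made for illustration and do not model the internal logic of the VME 1220 devices.

```python
# Behavioural sketch of the local arbitration described in section 2.7.
# Times are assertion times in ns; None means "not asserted".
from typing import Optional


def local_arbiter(dwb_time: Optional[float], bg0in_time: Optional[float]) -> str:
    if bg0in_time is None:
        # No grant has arrived yet, so there is nothing to arbitrate.
        return "wait for BG0IN*"
    if dwb_time is not None and dwb_time < bg0in_time:
        # The local request came first: the local master takes the bus.
        return "assert BBSY*"
    # The grant arrived first (or there is no local request): pass it on.
    return "drive BG0OUT*"


if __name__ == "__main__":
    print(local_arbiter(10.0, 50.0))
    print(local_arbiter(None, 50.0))
    print(local_arbiter(80.0, 50.0))
```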
3. Interface Signals
3.1 VMEbus Interface
This section provides information on VMEbus interface. Table 3.1 and
Table 3.2 list P1/J1 and P2/J2 pin assignments respectively. The P1 connec-
tor includes all the signals required for the 68000. The P2 connector provides
expansion of both address and data buses to 32 bits and also provides 96 pins
for user I/O lines.
The data transfer bus is very similar to the 68000's native buses except
for the following signals. Long word (LWORD*) is asserted for 32-bit data
transfers. The 6-bit address modifier (AM0 - AM5) allows the type of access
to be specified. The bus error signal (BERR*) is typically used to indicate
a memory error.
The interrupt bus has seven interrupt request lines (IRQi*), an interrupt
acknowledge (IACK*), and a daisy-chained priority signal (IACKIN*,
IACKOUT*). Each of the seven lines corresponds to an interrupt priority level.
The arbitration bus provides four levels of arbitration. For each level,
there is a bus request signal (BRi*) and a bus grant daisy chain (BGiIN*,
BGiOUT*). The utility bus consists of SYSCLK, SYSRESET*, SYSFAIL*,
ACFAIL*, and power supplies.
Table 3.1 VMEbus P1/J1 Pin Assignments
PIN No.  P1/J1 ROW A  P1/J1 ROW B  P1/J1 ROW C
1        D00          BBSY*        D08
2        D01          BCLR*        D09
3        D02          ACFAIL*      D10
4        D03          BG0IN*       D11
5        D04          BG0OUT*      D12
6        D05          BG1IN*       D13
7        D06          BG1OUT*      D14
8        D07          BG2IN*       D15
9        GND          BG2OUT*      GND
10       SYSCLK       BG3IN*       SYSFAIL*
11       GND          BG3OUT*      BERR*
12       DS1*         BR0*         SYSRESET*
13       DS0*         BR1*         LWORD*
14       WRITE*       BR2*         AM5
15       GND          BR3*         A23
16       DTACK*       AM0          A22
17       GND          AM1          A21
18       AS*          AM2          A20
19       GND          AM3          A19
20       IACK*        GND          A18
21       IACKIN*      SERCLK       A17
22       IACKOUT*     SERDAT*      A16
23       AM4          GND          A15
24       A07          IRQ7*        A14
25       A06          IRQ6*        A13
26       A05          IRQ5*        A12
27       A04          IRQ4*        A11
28       A03          IRQ3*        A10
29       A02          IRQ2*        A09
30       A01          IRQ1*        A08
31       -12VDC       +5VSTDBY     +12VDC
32       +5VDC        +5VDC        +5VDC
Table 3.2 VMEbus P2/J2 Pin Assignments
PIN No.  P2/J2 ROW B
1        +5VDC
2        GND
3        RESERVED
4        A24
5        A25
6        A26
7        A27
8        A28
9        A29
10       A30
11       A31
12       GND
13       +5VDC
14       D16
15       D17
16       D18
17       D19
18       D20
19       D21
20       D22
21       D23
22       GND
23       D24
24       D25
25       D26
26       D27
27       D28
28       D29
29       D30
30       D31
31       GND
32       +5VDC
P2/J2 ROW A: READY*, SLCT0*, ENB0*
P2/J2 ROW C: LW/B*, SLCT1*, ENB1*
3.2 Input Channels
The input channels consist of data channels (DATA00-31), a clock (CLK),
and a trigger signal (TRIG*). Table 3.3 shows the pin assignments of the
input channels.
Table 3.3 Input Channel Pin Assignments
PIN  DAM Signal  ECB Signal    PIN  DAM Signal  ECB Signal
(a)                            (b)  GND         GND
(c)                            (d)  GND         GND
1    DATA04      D04           2    DATA03      D03
3    DATA05      D05           4    DATA02      D02
5    DATA06      D06           6    CLK         4M-CLK
7    DATA07      D07           8    DATA14      D14
9    DATA08      D08           10   DATA15      D15
11   DATA09      D09           12   TRIG*       FIEN*¹
13   DATA10      D10           14   DATA01      D01
15   DATA11      D11           16               E
17   DATA12      D12           18               AS*
19   DATA13      D13           20               UDS*
21   DATA00      D00           22               LDS*
23   DATA31      A15           24   DATA16      R/W*
25   DATA30      A14           26   DATA29      A13
27   DATA28      A12           28               FC2
29   DATA27      A11           30               FC1
31   DATA26      A10           32               FC0
33   DATA25      A09           34   DATA17      A01
35   DATA24      A08           36   DATA18      A02
37   DATA22      A06           38   DATA19      A03
39   DATA23      A07           40   DATA20      A04
41   DATA21      A05           42               DTACK*
43               8M-CLK        44               6800IRQ*
45               1M-CLK        46               VMA*
¹FIEN*: Fault Injection Enable, a signal transferred from the fault injection module.
Appendix A Schematic Diagrams
A.1 Clock Control
A.2 Address Generator
A.3 Address Bus Buffers and Address Modifier Selector
A.4 Data Transfer Control
A.5 Input Channel Selector and Data Bus Buffers
A.6 VMEbus Master Control
[Schematic diagrams A.1 through A.6: drawings not reproduced in this text.]
Appendix B Parts List
Table B.1 DAM Parts List (1)
LABEL  Part Number  Pins  DESCRIPTION
IC1    74LS132      14    Quadruple Schmitt NAND gates
IC2    74LS161A     16    Synchronous 4-bit counter
IC3    74AS74       14    Dual D-type F/Fs
IC4    74LS161A     16    Synchronous 4-bit counter
IC5    74LS161A     16    Synchronous 4-bit counter
IC6    74LS161A     16    Synchronous 4-bit counter
IC7    74LS161A     16    Synchronous 4-bit counter
IC8    74LS161A     16    Synchronous 4-bit counter
IC9    74LS161A     16    Synchronous 4-bit counter
IC10   74LS04       14    Hex inverters
IC11   74AS573      20    Octal D-type transparent latches
IC12   74AS573      20    Octal D-type transparent latches
IC13   74AS573      20    Octal D-type transparent latches
IC14   74AS573      20    Octal D-type transparent latches
IC15   74AS74       14    Dual D-type F/Fs
IC16   74AS02       14    Quadruple 2-input NOR gates
IC17   74AS00       14    Quadruple 2-input NAND gates
IC18   74AS04       14    Hex inverters
IC19   74LS153      16    Dual 4-to-1 data selectors
Table B.2 DAM Parts List (2)
LABEL  Part Number  Pins  DESCRIPTION
IC20   74LS153      16    Dual 4-to-1 data selectors
IC21   74LS153      16    Dual 4-to-1 data selectors
IC22   74LS153      16    Dual 4-to-1 data selectors
IC23   74AS573      20    Octal D-type transparent latches
IC24   74AS573      20    Octal D-type transparent latches
IC25   74AS573      20    Octal D-type transparent latches
IC26   74AS573      20    Octal D-type transparent latches
IC27   74AS573      20    Octal D-type transparent latches
IC28   VME1220A     24    VMEbus master controller (Non-slot 1, P-45)
IC29   VME1220B     24    VMEbus master controller (Non-slot 1, P-45)
IC30   74AS02       14    Quadruple 2-input NOR gates
IC31   74LS20       14    Dual 4-input NAND gates
IC32   74AS573      20    Octal D-type transparent latches
IC33   74AS00       14    Quadruple 2-input NAND gates
Appendix C DAM Board Layout
C.1 Component Side Layout
C.2 Wiring Side Layout
Appendix D Copies of Data Sheets
D.1 VME 1220 Non-Slot 1 VMEbus Master Controller
June 1990
VME 1210/1220
Slot 1 and Non-Slot 1 VMEbus Master Controllers
Distinctive Features
• VME 1210 provides two device chip set for slot 1
master bus controller and single level arbiter
• VME 1220 provides two device chip set for non-slot 1
master bus controller
• Integrates 48 mA and 64 mA VMEbus signals:
AS*, DS0*, DS1*, WRITE*, BR*, BBSY*
• Integrates input hysteresis buffers
• Supports Release When Done (RWD) and Release On
Request (ROR) protocols
• Supports address pipelining, block transfers, and
early BBSY* release
• Available in Commercial, Industrial and Military
temperature ranges
Programmable Version Available
If the VME 1210/1220 does not match the requirements
of the design, a programmable version is available (the
PLX 464) which allows the user to customize all inputs,
outputs and logic. Programming is performed using
industry standard tools such as ABEL™ and CUPL™
software and commonly available PLD programming
hardware. Contact PLX for a data sheet on the PLX 464
and other PLX products.
Applications
VMEbus masters residing in slot 1 boards (VME 1210)
VMEbus masters residing in non-slot 1boards (VME 1220)
General Description
The VME 1210: The VME 1210 is comprised of the VME
1210A and the VME 1210B for slot 1 applications. The
devices are CMOS and packaged in 24 pin 300 mil wide DIPs
or 28 pin J-type LCCs. The VME 1210A provides bus
requesting, local arbitration, and single level system
arbitration. The VME 1210B functions as the VMEbus controller.
The requester initiates a VMEbus request from the local
master's bus request for a data or interrupt cycle. The bus
controller controls the bus after initiation of a bus cycle and
relinquishes the bus at the end of the bus cycle. The bus
controller supervises the handshaking between the local
master CPU and the slave modules.
The VME 1220: The VME 1220 is comprised of the VME
1220A and the VME 1220B for non-slot 1 applications. The
devices are CMOS and packaged in 24 pin 300 mil wide DIPs
or 28 pin J-type LCCs. The VME 1220A provides bus
requesting and local arbitration. The VME 1220B functions
as the VMEbus controller. The requester initiates a VMEbus
request from the local master's bus request for a data or
interrupt cycle. The bus controller controls the bus after
initiation of a bus cycle and relinquishes the bus at the end
of the bus cycle. The bus controller supervises the
handshaking between the local master CPU and the slave
modules.
[Pinout drawings of the VME 1210A, VME 1210B, VME 1220A and VME 1220B
(DIP packages): not reproduced in this text.]
Figure 1. Pinout of VME 1210/1220 (DIPs)
ABEL is a trademark of Data I/O Corp.
CUPL is a trademark of Logical Devices, Inc.
Pin Description
VME 1220A
Pin # LCC   Pin # DIP  Signal     Type  Function
3           2          BREL       I     Active high; Bus release signal indicating BBSY* can be released.
4           3          LAS*       I     Active low; Address strobe from local master.
5           4          SYSRESET*  I     Active low; VMEbus System Reset.
6           5          DWB*       I     Active low; Device wants bus, local master requests control of bus.
7           6          AS*        I     Active low; VMEbus Address Strobe.
9           7                     I     Connect to pin 17 (DIP) or pin 20 (LCC).
10          8                     I     Connect to pin 16 (DIP) or pin 19 (LCC).
11          9          NC         I     No Connect.
12          10         NC         I     No Connect.
13          11         BGIN       I     Active high; Inverted VMEbus Bus Grant In signal, BGIN*.
14,21,24    12,18,20   VSS              Chip Ground.
16          13                    O     Connect to pin 14 (DIP) or pin 17 (LCC).
17          14                    I     Connect to pin 13 (DIP) or pin 16 (LCC).
18          15         NC         O     No Connect.
19          16                    O     Connect to pin 8 (DIP) or pin 10 (LCC).
20          17                    O     Connect to pin 7 (DIP) or pin 9 (LCC).
23          19         DHBA*      O     Active low; Device has bus address, address buffer enable.
25          21         BGOUT*     O     Active low; VMEbus Bus Grant Out signal.
26          22         BBSY*      I/O   Active low, 48 mA open collector; VMEbus Bus Busy signal.
27          23         BR*        O     Active low, 48 mA open collector; VMEbus Bus Request signal.
2,28        1,24       Vcc              +5 V Chip Power
1,8,15,22              NC               No Connect.
Pin Description
VME 1210B and VME 1220B
Pin # LCC   Pin # DIP  Signal   Function
3           2          DWB*     Active low; Device wants bus, local master wants control of VMEbus.
4           3          LDS0*    Active low; Lower data strobe from local master.
5           4          LDS1*    Active low; Upper data strobe from local master.
6           5          DTACK*   Active low; VMEbus Data Transfer Acknowledge, data is valid during a read cycle or data has been accepted from the bus during a write cycle.
7           6          BERR*    Active low; VMEbus Error signal.
9           7          DHBA*    Active low; Device has bus address, address buffer enable.
10          8          LAS*     Active low; Address strobe from local master.
11          9          R/W*     Active high/low; Read or write cycle from local master.
12          10         BBSY*    Active low; VMEbus Busy, local master controls bus.
13          11                  Connect to pin 17 (DIP) or pin 20 (LCC).
14,21,24    12,18,20   Vss      Chip Ground.
16          13         DS0*     Active low; 64 mA VMEbus lower Data Strobe signal, indicates valid data on bus.
17          14                  Connect to Vss.
18          15         DS1*     Active low; 64 mA VMEbus upper Data Strobe signal, indicates valid data on bus.
19          16         LBERR*   Active low; Open collector signal, bus error to local master.
20          17                  Connect to pin 11 (DIP) or pin 13 (LCC).
23          19         WRITE*   Active low; 48 mA VMEbus Write signal, indicates bus read or write cycle.
25          21         DHBD*    Active low; Device has bus data, data buffer enable.
26          22         LDTACK*  Active low; Open collector signal, data acknowledge to local master.
27          23         AS*      Output. Active low; 64 mA VMEbus Address Strobe signal, indicates valid address on bus.
2,28        1,24       Vcc      +5 V Chip Power
                       NC       No Connect.
VME 1210/1220 Timing Waveforms
[Timing diagram; waveforms shown for BBSY*, BGIN, BREL, DHBA*, DHBD*,
LAS*, AS*, LDSn*, DSn*, DTACK*, BGOUT* and DWB*.]
Figure 5. Timing Diagram
Timing Specifications
Timing
°Parameters
"- tl
Signals
DwE" to BR"asserted
C-45
II
9C
0
45
Max. -, ime(ns) unless
otherwise specified
M-6_
to 13o
o
65
Description
If DWe" is assertedatte, 1.AS"
LIC" to BR"asserted
BR" to BGasserted
12
| t3
L'
-- t4 BGINto BBSY"asserted
t5 BBSY"to BR° negated
IS BBSY"to DHBA*asserted
BBSY"to BGINnegated
If LAS' is assertedaher DWB"
VME 12: 0 onlywhen internalBR" generat',c_
(5,G connectedto BGIN)
VM- 12tC onlywhen extema_BR"received
(BG connectedto BGIN)
t7
System arbitertime Systemarbil¢ Time VME 1220 only
125 185 VME 1210 only, in_udes delay line:55nsIor
M-65, 45ns for M-5r. 35ns for C-4S, 40nslot
135
45
45
i : t8 DHBA" to DHBD* asserted
19 DHBA" to WRITE" asserted
tt0 DHBA" to AS"asserted
tll
t12
-r
t13
m
t14
t15
.. tle
t17
: _ t18
I;9
_ t22
im
t23
._ t24
w ,
_5
_6
I"
Note:
AS* to DSn" asserted
BGIN to BBSY" negated
BRELto BBSY" negated
DTACK" to LDTACK"asserted
LDTACiCto LAS'/LDSn"negated
LAS"to DHBA* negated
DWB"to DHBA"negated
LAS"to AS. negated
LDSn"toDSn" negated
195
65
65
65
..r5min
45 max
35 min
Systemarbitertime
45
Systemarbiter time
C-35, 60ns forC-25 par1
VME 122_ only
VME 1210 only
VME 1220 only
65
45 65 Conditionalupon R/W" value
1309O
70 (rain.)
45
80 max
70 rain
135 max
105 rain
45
65
120max
10mill
195 max
165 rain
65
65
@ Localmaster
45
@ LocaJmaster
45 65
45 65
SO 72
50 72
LDSn"to DSn" negated 50 72
Dan" to WRITE" negated 45 65
45 65DSn'/DTAC)C to LDTACK"
negated
BGIN to BGOUT" asserted 130
55+d,65+d
65
195
65
9O
25+d,35+d,45+d
45
135
45
Ensures35ns minimumaddress to AS"and
data to DSn" set up times
VME 1210 only;
VME 1210 only;t7min + tl 2rain> 90 ns rain.
BgSY" assertion"
VME 1220 only
VME 1220 only. (see note below)
Validonlywhen BREL is asserted after
BGIN is nega_:l
Localmasters time to negate strobes
If DWB"alreadynegated
If I.AS"alreadynegated
Ensures lOnshold time
Eadiestnegationof DSn" or DTACK" causes
LDTACIC to be negated
VME 1220 only
VME 1210 only
Asse_on time when alreadyhave bus
(BBSY"asserted_
Assertiontime when alreadyhave bus
(BBSY"asserted)
BGIN to e3OUT" negated
Latest of LAS./DWB"to AS"
asserted
Latest of DHBD'A.DS"to DS"
asserted
Note: BBSY* is guaranteed to be asserted for a minimum of 90 ns in the VME 1210A devices and the C-45 device of the VME 1220A, even if BGIN is negated
immediately after BBSY* is asserted. For the C-35 and C-25 VME 1220A devices, the sum of the system arbiter "BBSY* asserted to BGIN negated"
time and the t12 minimum time on the VME 1220A must be greater than 90 ns. Generally, this time will be taken up completely by the system arbiter
time; however, if not, a delay line can be connected between pins 8 and 16 (DIP) or pins 10 and 19 (LCC) on the VME 1220A device to guarantee the
90 ns minimum. For example, if the system arbiter "BBSY* asserted to BGIN negated" time was 35 ns (min), no delay line would be needed for the
C-35 VME 1220A device, since 35 + 75 > 90. However, a 10 ns delay line would be required for the C-25 VME 1220A.
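The arithmetic in the note can be sketched as a small helper. This is an illustration only: the function name is hypothetical, the 75 ns figure for the C-35 part is taken from the "35 + 75 > 90" example, and the 45 ns figure used for the C-25 part is inferred from the statement that a 10 ns delay line is then required; it is an assumption rather than a value quoted from the timing table.

```python
# Illustrative sketch of the delay-line calculation in the note.
REQUIRED_BBSY_NS = 90  # minimum BBSY* assertion time from the note


def delay_line_ns(arbiter_ns: float, t12_min_ns: float) -> float:
    """Extra delay needed so that arbiter time + t12(min) + delay reaches 90 ns."""
    shortfall = REQUIRED_BBSY_NS - (arbiter_ns + t12_min_ns)
    return max(0, shortfall)


if __name__ == "__main__":
    # C-35 part: t12(min) = 75 ns, from the example in the note.
    print("C-35:", delay_line_ns(35, 75), "ns")
    # C-25 part: t12(min) = 45 ns is an inferred, assumed value.
    print("C-25:", delay_line_ns(35, 45), "ns")
```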
APPENDIX B
FAULT INJECTION MODULE
SCHEMATIC DIAGRAMS
Ver. 1.0
[Schematic diagrams: drawings not reproduced in this text.]
Fault Injection Module
Parts List (1)
Ref No.  Part Number   Size  Description
IC1      SN74ALS520    20    8-bit Identity Comparator
IC2      SN74ALS520    20    8-bit Identity Comparator
IC3      SN74ALS138    16    3 to 8 Decoder
IC4-1    SN74ALS32     14    Quad 2-Input OR Gates (1/4)
IC5      VME 2000      24†   Slave Module Interface Device
IC6      SN74F374      20    Octal D-Type Flip-Flops
IC7      SN74LS645-1   20    Octal Bus Transceivers
IC8      MC68230 P8    48    Parallel Interface/Timer (PIT-0)
IC9-1    SN74ALS04B    14    Hex Inverters (1/6)
IC10-1   SN74LS244     20    Octal Buffers (1/2)
IC11     SN74ALS161B   16    4-bit Binary Counter
R1                     8‡    R Network, seven 4.7kΩ (1/7)
IC12     SN74ALS520    20    8-bit Identity Comparator
IC13     SN74ALS520    20    8-bit Identity Comparator
IC14-1   SN74ALS04B    14    Hex Inverters (1/6)
IC14-2   SN74ALS04B    14    Hex Inverters (2/6)
IC14-3   SN74ALS04B    14    Hex Inverters (3/6)
IC14-4   SN74ALS04B    14    Hex Inverters (4/6)
IC14-5   SN74ALS04B    14    Hex Inverters (5/6)
IC15-1   SN74ALS02     14    Quad 2-Input NOR Gates (1/4)
IC16-1   SN74ALS01     14    Quad 2-Input NAND Gates (1/4)
IC17     SN74ALS153    16    Dual 1 of 4 Data Selectors
IC18-1   SN74ALS74A    14    Dual D-Type Flip-Flops (1/2)
R2                     8     R Network, seven 4.7kΩ (2/7)
DL1      RWT050P       14    50ns Delay Line
†300mil 24 pin DIP
‡Single-in-line package
Fault Injection Module
Parts List (2)
Ref No.  Part Number   Size  Description
IC9-2    SN74ALS04B    14    Hex Inverters (2/6)
IC9-3    SN74ALS04B    14    Hex Inverters (3/6)
IC15-2   SN74ALS02     14    Quad 2-Input NOR Gates (2/4)
IC16-2   SN74ALS01     14    Quad 2-Input NAND Gates (2/4)
IC19     SN74ALS153    16    Dual 1 of 4 Data Selectors
IC20     SN74ALS153    16    Dual 1 of 4 Data Selectors
IC21     SN74ALS153    16    Dual 1 of 4 Data Selectors
IC22     SN74ALS153    16    Dual 1 of 4 Data Selectors
R3                     8     R Network, seven 4.7kΩ (3/7)
R4                     8     R Network, seven 4.7kΩ (4/7)
R5                     8     R Network, seven 4.7kΩ (5/7)
IC23     MC68230 P8    48    Parallel Interface/Timer (PIT-1)
IC24     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC25     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC26     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC27     MC68230 P8    48    Parallel Interface/Timer (PIT-2)
IC28     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC29     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC30     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC31     MC68230 P8    48    Parallel Interface/Timer (PIT-3)
IC32     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC33     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC34     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC35     MC68230 P8    48    Parallel Interface/Timer (PIT-4)
IC36     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC37     SN74LS449     16    Bus Transceivers w/ Bit dir.
IC38     SN74LS449     16    Bus Transceivers w/ Bit dir.
[Figure: 4-bit slice of the fault injection module. Labels in the drawing:
PORT (C), PORT (B) BIT 1 - BIT 4, CONTROL, TRIG, PULSE GENERATOR, VCC, and
an LS-446 transceiver with pins GAB*, GBA*, A1-A4, B1-B4, DR1-DR4.]
GBA*  GAB*  DRn  OPERATION
H     H     X    ISOLATION
H     L     H    A* TO B
H     X     L    ISOLATION
FIG. FAULT INJECTION MODULE (4-BIT)
Additional Components for the New Experimental System
Part No.   Manufacturer  Description                       Cost ($)
MZ 7500    MIZAR         GPIB Interface Board for VMEbus   695.00
           MIZAR         Single Cable for MZ 7500          75.00
MacII488   IOtech        GPIB Controller Board for Mac II  535.00
PFG5105    Tektronix     Pulse Generator (demo)            2,471.25
PFG5105    Tektronix     Pulse Generator (new)             2,800.75
TM5006     Tektronix     Prog. Mainframe (demo)            851.25
FIM        JHU           48ch Fault Injector
Mac II     Apple         Macintosh II
SPARC      Sun Micro.    SPARCstation work station
[Block diagram. Labeled blocks: SPARC with SCSI I/F; VMEbus SYSTEM
(OS-9/68000) containing SCSI I/F, SRAM (2MB, up to 8MB), HDD (80MB),
FDC/SCSI, FDD (3.5"), POWER (+5V, ±12V), CPU (MC68030), GP-IB I/F, FIM
and DAM on the VMEbus; Mac II with GP-IB I/F and RS-232; PULSE GEN with
GPIB and TRG; CUT (MC68000) on the LOCAL BUS; VT100 on RS-232.]
FAULT INJECTION EXPERIMENTAL CONFIGURATION
Targeted Features of the Fault Injection Module
• Fault Injector
- Provides 48 channels with bit-definable outputs using four PI/T
(MC68230) and twelve bus transceiver (74LS446) chips.
- Supports three output states (0, 1, and Z†) on each channel.
- A 2-channel pulse generator is installed as a source of fault
injections.
- Supports single/multiple faults of stuck-at-0/1 types with
duration varying from 40 ns to 99.9 ms.
• Word Recognizer
- Provides a versatile trigger source for the fault injection and data
acquisition.
- Implements a 16-bit word recognizer using an MC68230 PI/T and
two 74LS686 magnitude comparators.
† Z: High-impedance
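The effect of the three channel states on a set of monitored signals can be modeled in software as follows. This is a behavioural sketch under stated assumptions, not a description of the hardware: the function names, the representation of the 48 channels as bits of an integer, and the choice to express durations in nanoseconds are all hypothetical. Only the channel count, the three states, and the 40 ns to 99.9 ms duration range come from the list above.

```python
# Behavioural sketch of stuck-at fault injection on up to 48 channels.
# State "0" forces the channel low, "1" forces it high, and "Z"
# (high-impedance) leaves the original signal value untouched.
NUM_CHANNELS = 48
MIN_DURATION_NS = 40
MAX_DURATION_NS = 99_900_000  # 99.9 ms expressed in ns


def inject(signals: int, faults: dict) -> int:
    """Apply stuck-at faults {channel: state} to a 48-bit signal word."""
    for channel, state in faults.items():
        if not 0 <= channel < NUM_CHANNELS:
            raise ValueError(f"channel {channel} out of range")
        if state == "0":
            signals &= ~(1 << channel)   # stuck-at-0
        elif state == "1":
            signals |= 1 << channel      # stuck-at-1
        elif state != "Z":
            raise ValueError(f"unknown state {state!r}")
    return signals


def valid_duration(duration_ns: float) -> bool:
    """True if the fault duration lies within the supported range."""
    return MIN_DURATION_NS <= duration_ns <= MAX_DURATION_NS


if __name__ == "__main__":
    faulty = inject(0b1010, {1: "0", 0: "1", 3: "Z"})
    print(f"{faulty:04b}", valid_duration(40), valid_duration(10))
```

A multiple fault is simply a dictionary with more than one entry that is not "Z".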
