On-Line Monitor Design of Finite-State Machines by Gao, Feng & Hayes, John P. (John Patrick)
JOURNAL OF ELECTRONIC TESTING: Theory and Applications 19, 537–548, 2003
© 2003 Kluwer Academic Publishers. Manufactured in The Netherlands.
On-Line Monitor Design of Finite-State Machines
FENG GAO AND JOHN P. HAYES
Advanced Computer Architecture Lab., University of Michigan, Ann Arbor, MI 48109, USA
Received July 21, 2002; Revised November 24, 2002
Editors: C. Metra and M. Sonza Reorda
Abstract. On-line monitoring is a useful technique for ensuring system reliability. By continuously supervising
the system’s operation, a wide range of problems, such as physical defects, transient faults and design errors,
can be detected. A monitor M∗’s behavior can be viewed as an abstraction of the target system M’s behavior,
and can be represented by a homomorphic mapping from M to M∗. We present a systematic procedure to select
homomorphisms for monitor design and measure their costs based on a behavioral fault model. Analysis of the
method shows that monitors with very few states and low area can provide high fault coverage. Experimental results
are presented which quantify the basic trade-off between area overhead and fault coverage. Simulation results under
the industry-standard single stuck-at fault model are also reported.
Keywords: on-line monitoring, homomorphism, finite-state machine
1. Introduction
Increasingly, integrated circuits (ICs) are being used
in safety-critical applications, from anti-lock braking
in automobiles, to fly-by-wire aircraft, to prosthetic
systems in the human body. At the same time, ICs
are becoming more complex and compact, resulting
in their increased susceptibility to transient or in-
termittent faults. Such faults affect the behavior of
the circuits from time to time and must be detected
quickly, suggesting a need for efficient on-line er-
ror monitoring. An on-line monitor continuously su-
pervises the operation of a circuit and reports illegal
behavior.
On-line monitoring strategies can be applied at var-
ious levels of abstraction, ranging from the gate and
register-transfer levels to the system level. Run-time
errors can be caught if we can check whether the cur-
rent outputs are valid. In such approaches, the outputs
of a circuit are usually encoded with an error-detecting
code, and a monitor detects the occurrence of non-
code outputs. A straightforward approach [18] is to ap-
pend check bits to the normal output bits. The resulting
designs consist of functional logic, a check-bit gener-
ator, and a checker. The functional logic generates the
normal outputs, the check-bit generator generates the
check bits, and the checker determines if the outputs
are valid codewords. Various encoding methods, for in-
stance, group parity codes [20] and Berger codes [14],
can be used to encode the outputs. A basic disadvan-
tage of these methods is that a single fault inside the
circuit may change the outputs to other codewords and
so be undetectable.
We can also embed self-testing structures in the tar-
get design. Tests for functional modules can then be ap-
plied during the cycles when the modules are inactive.
If the module functions are simple and the free cycles
for each functional module can be pre-determined, we
can schedule the test and normal operations at the same
time without clearly defining a boundary between these
two modes [8]. On the other hand, if the cycles during
which a module is inactive are irregular, extra control
signals may be needed to switch between the testing
and normal modes [15]. The main disadvantage of this
type of self-testing is that it is not good at detecting
transient faults.
Signature analysis [12] is another widely studied ap-
proach to on-line monitoring of programmable sys-
tems. A co-processor, often called a watchdog or
diagnostic processor, whose duties include signature
computation and checking, is usually employed. Error
detection by means of a watchdog is a two-phase pro-
cess. In the first phase, signatures are computed and
assigned to basic blocks of the target program, where
a basic block is a set of instructions with no jumps al-
lowed from or into these instructions. During the sec-
ond phase, the monitor computes run-time signatures
and compares them with the precomputed signatures
received from the target system. Differences between
these two sets of signatures reveal permanent or tran-
sient errors. Besides their limited ability to detect data-
related errors, such methods also cause performance
degradation by increasing the number of instructions
in a program.
In this paper, the system to be monitored M is mod-
eled as a finite-state machine (FSM), for which a simple
behavior fault model is defined. A monitor M∗ is then
abstracted systematically from the behavioral descrip-
tion of M using homomorphisms selected under the
guidance of appropriate cost measures. The effects of
the number of states in the monitor on its area over-
head and fault coverage are analyzed. It is shown that
one-state monitors, i.e. combinational monitors, have
small area overhead. Moreover, for FSMs with high
output/state ratios, such monitors also have very high
fault coverage under our behavioral fault model. Sim-
ulations under the standard single stuck-at fault model
are also performed.
The remainder of this paper is organized as follows.
After the introduction of some notation and our system
model in the next section, we propose an algorithm for
FSM behavior abstraction based on homomorphisms in Section 3.
Section 4 analyzes how the area overhead and fault coverage of
the monitors vary with respect to the number of states in the
monitors. Experimental results are presented in Section 5, while
some conclusions are discussed in Section 6.
Fig. 1. (a) An example FSM M, (b) a next-state fault of M, and (c) an output fault of M.
2. System Model
A finite-state machine (FSM) M is defined as a 6-tuple
{I, S, δ, O, λ, S0}, where I is the set of inputs,
S is the set of states, δ: I × S → 2^S is the state transition
function, O is the set of outputs, λ: I × S → 2^O
is the output function, and S0 ⊆ S is the set of initial
states [11]. An FSM is deterministic if δ and λ are deterministic,
that is, no input can map the machine to
two or more outputs or next states at the same time. An
FSM can be described using a state transition graph
(STG). The STG of M is denoted by G_M(V_M, E_M),
where V_M is the set of nodes, representing the state set
of M, and E_M = {〈u, v〉 | u, v ∈ V_M} is the set of edges,
representing the transition set of M. Each edge 〈u, v〉 is
labeled with a set of input-output pairs, each of which
denotes an input that triggers the state transition and
the corresponding output. The number of edges that
start from or end at a state u ∈ V_M is called the degree
of u. The maximum degree of all the states is called
the degree of the STG. For convenience, the degree of
the STG is also referred to as the degree of the corresponding
FSM. Fig. 1(a) shows a state transition graph
representing a seven-state FSM. The degrees of states
A, D, and F are two, while that of B is four, which is
the maximum degree of all these states. Therefore, the
degree of the STG is four.
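The definitions above can be illustrated with a minimal Python sketch. The three-state machine below is hypothetical (it is not the FSM of Fig. 1), the dictionary layout `delta[(input, state)]` is an assumed encoding, and counting a self-loop once toward a state's degree is also an assumption, since the text does not say how self-loops are counted.

```python
from collections import defaultdict

# Hypothetical deterministic FSM: delta[(input, state)] -> next state.
delta = {
    ("0", "A"): "B", ("1", "A"): "A",
    ("0", "B"): "C", ("1", "B"): "A",
    ("0", "C"): "C", ("1", "C"): "A",
}

def stg_edges(delta):
    """Collapse the transition table into the STG edge set {<u, v>}."""
    return {(state, nxt) for (_inp, state), nxt in delta.items()}

def degrees(delta):
    """Number of STG edges that start from or end at each state.

    Assumption: a self-loop <u, u> contributes one to the degree of u.
    """
    deg = defaultdict(int)
    for u, v in stg_edges(delta):
        deg[u] += 1
        if v != u:
            deg[v] += 1
    return dict(deg)

state_degrees = degrees(delta)
stg_degree = max(state_degrees.values())  # degree of the STG (and of the FSM)
```

For this toy machine, state A touches the edges A→B, A→A, B→A and C→A, so it determines the degree of the STG.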
Let M = {I, S, δ, O, λ, S0} be the specification of
an FSM, and M′ = {I, S, δ′, O, λ′, S0} be its implementation.
M′ has a next-state fault on receiving input
i in state s if δ(i, s) ≠ δ′(i, s), where s ∈ S and i ∈ I.
Fig. 2. (a) Base monitor M∗0; (b) monitor M∗1 formed by applying a state homomorphism to M∗0.
M′ has an output fault on receiving input i in state s
if λ(i, s) ≠ λ′(i, s), where s ∈ S and i ∈ I. We refer
to next-state faults and output faults as behavior faults.
Two versions of M with behavior faults are given in
Fig. 1(b) and (c). The dotted rectangles highlight a next-
state fault and an output fault of M . For simplicity, we
assume that an FSM has at most one next-state or one
output fault at any time. We call this the single behavior
fault (SBF) model.
With the concepts defined above, we now in-
troduce our overall system model. Suppose M =
{I, S, δ, O, λ, S0} is the FSM to be monitored. We
first define a base monitor M∗0 = {I × O, S, δ∗0,
{0, 1}, λ∗0, S0}, which is able to detect all the SBFs of
M, where, for a ∈ I, b ∈ O, and s ∈ S, (1) δ∗0(ab, s) = δ(a, s),
and (2) λ∗0(ab, s) = 0 if λ(a, s) = b; otherwise,
the output is 1, indicating an SBF. For example,
the base monitor of the FSM M in Fig. 1(a) is shown
in Fig. 2(a). For brevity, only those transitions with out-
put 0 are shown, while those with output 1 are omitted.
Therefore, the indicated transitions correspond to all
valid transitions in M .
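The base monitor construction can be sketched in a few lines of Python. The two-state machine, the function names `base_monitor` and `run_monitor`, and the trace format are all illustrative assumptions; only the two defining rules for δ∗0 and λ∗0 come from the text.

```python
# Hypothetical target FSM M: delta[(a, s)] -> next state, lam[(a, s)] -> output.
delta = {("0", "A"): "B", ("1", "A"): "A", ("0", "B"): "A", ("1", "B"): "B"}
lam = {("0", "A"): "x", ("1", "A"): "y", ("0", "B"): "y", ("1", "B"): "x"}

def base_monitor(delta, lam):
    """Return (delta0, lam0) of the base monitor, whose input is the pair (a, b)."""
    def delta0(a, b, s):
        # Rule (1): the monitor follows the specified next state of M.
        return delta[(a, s)]

    def lam0(a, b, s):
        # Rule (2): output 0 if the observed output b is the specified one, else 1.
        return 0 if lam[(a, s)] == b else 1

    return delta0, lam0

def run_monitor(trace, s0, delta, lam):
    """Feed observed (input, output) pairs of M to the monitor; return its outputs."""
    delta0, lam0 = base_monitor(delta, lam)
    s, flags = s0, []
    for a, b in trace:
        flags.append(lam0(a, b, s))
        s = delta0(a, b, s)
    return flags

good_flags = run_monitor([("0", "x"), ("0", "y"), ("1", "y")], "A", delta, lam)
bad_flags = run_monitor([("0", "x"), ("0", "x")], "A", delta, lam)
```

In the second trace, the output observed in state B on input 0 is x instead of the specified y, so the monitor raises its output to 1 in that cycle.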
Fig. 3. On-line monitoring system: (a) overall structure; (b) monitor design.
We next perform behavior abstraction on the base
monitor using homomorphisms, which provide the
theoretical foundation for our approach. A many-to-one
mapping Φ = (φI, φS, φO) is a homomorphism
from M1 = {I1, S1, δ1, O1, λ1, S01} to M2 =
{I2, S2, δ2, O2, λ2, S02} if and only if φI(i1) ∈ I2 for
i1 ∈ I1, φS(s1) ∈ S2 for s1 ∈ S1, φO(o1) ∈ O2
for o1 ∈ O1, and φS(δ1(i1, s1)) = δ2(φI(i1), φS(s1)) and
φO(λ1(i1, s1)) = λ2(φI(i1), φS(s1)) for all i1 ∈ I1 and
s1 ∈ S1. M2 is the homomorphic image of M1. Fig. 2(b)
shows a homomorphic image of the FSM M∗0 in
Fig. 2(a). This particular homomorphism maps states A
and B of M∗0 into one state in its image, while all other
states, inputs and outputs are mapped to themselves.
Intuitively, a homomorphism reduces the number of
states and preserves the mappings of the next-state and
output functions; hence it is a precise form of behavior
abstraction.
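As a rough illustration of what mapping two states into one does to the transition structure, the following Python sketch merges a pair of states and collects the image transitions as sets. The helper names are hypothetical. The transitions A→B and B→C on input 00 follow the example discussed for Fig. 2(a); the transition C→C is an assumption added only to make the toy table complete.

```python
# Fragment of a transition table: delta[(input, state)] -> next state.
# A -00-> B and B -00-> C follow the running example; C -00-> C is assumed.
delta = {("00", "A"): "B", ("00", "B"): "C", ("00", "C"): "C"}

def merge_states(delta, s1, s2, merged):
    """Image of delta under the state map that sends s1 and s2 to `merged`.

    Each image entry is a set, because merging may yield several next states.
    """
    def phi(s):
        return merged if s in (s1, s2) else s

    image = {}
    for (inp, state), nxt in delta.items():
        image.setdefault((inp, phi(state)), set()).add(phi(nxt))
    return image

def is_deterministic(image):
    """True if every (input, state) pair of the image has exactly one next state."""
    return all(len(targets) == 1 for targets in image.values())

image_ab = merge_states(delta, "A", "B", "AB")
image_bc = merge_states(delta, "B", "C", "BC")
```

Merging A and B makes input 00 lead from AB both to AB itself and to C, so the image is non-deterministic, whereas merging B and C in this toy fragment keeps the image deterministic.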
The overall monitoring scheme is illustrated in
Fig. 3(a). The monitor M∗ operates in lockstep with
M; it can mimic the state transitions of M and report
SBFs with just a one-cycle delay. If the monitor is the
base monitor M∗0 , all possible SBFs can be detected.
If lower fault coverage is allowed, we can perform ho-
momorphisms on M∗0 to obtain a smaller, and therefore
more economical, monitor M∗. As shown in Fig. 3(b),
M∗ is composed of two parts: interface logic which
implements the homomorphisms from M to M∗, and a
compact FSM Mc which approximates the state transi-
tions of M .
3. Abstraction Algorithm
To reduce the monitor area, we apply homomorphisms
to M∗0 in two stages called the state and input stages.
In the state homomorphism stage, we gradually reduce
the number of states in the monitor. The fault coverage
hence decreases because the monitor has less knowledge
of the target system’s states. In the input homomorphism
stage, by contrast, we perform only a simple transformation,
which does not reduce the fault coverage. We
do not consider output homomorphisms, since the base
monitor has only a single output. Note that the inputs
of M∗0 incorporate both the inputs and outputs of the
original machine M .
The state homomorphism stage involves constructing
a series of homomorphisms of the form M∗0 →
M∗1 → M∗2 → · · · → M∗n = M∗. Each homomorphism
H : M∗i → M∗i+1 (0 ≤ i < n) maps two states of
M∗i to one state of M∗i+1. Such mappings are referred
to as simple state homomorphisms. While H reduces
the number of states in the monitors, the number of de-
tectable SBFs also decreases because of the state merg-
ing. Homomorphisms that lead to as few undetectable
SBFs as possible are desirable, because our goal is to
obtain monitors with small area, capable of detecting
as many SBFs as possible. Furthermore, the number of
available simple state homomorphisms is proportional
to the square of the number of states in the monitor.
Therefore, to guide the selection of simple state homo-
morphisms, we define a homomorphism cost function
C(H ) that measures the homomorphism’s impact on
fault coverage.
For a homomorphism H : M∗i → M∗i+1, C(H) is the
number of SBFs that become undetectable if we replace
M∗i with M∗i+1 as the monitor. In fact, those faults can
readily be enumerated. To facilitate the calculation, we
classify them into three types, which are defined next
via an example, along with a calculation method. We
denote the inputs that trigger transitions from state s1 to
state s2 by tr_input(s1, s2), and the states in M∗0 that are
mapped to s by pre-image(s). Let |T| be the cardinality
of set T. As an example, consider the base monitor and its
homomorphic image in Fig. 2: the monitor M∗1
in Fig. 2(b) is obtained by applying a simple state homomorphism
H to M∗0, which maps states A and B
of M∗0 into one state AB of M∗1. The three SBF types
introduced by H are as follows.
Fig. 4. The three types of SBFs and the corresponding homomorphism costs.
• Next-state-change faults. If the next state of a transition
changes erroneously from B to A, or from A to B, this fault
becomes undetectable when we replace M∗0 with M∗1
as the monitor. For example, the erroneous transition from E
to A on receiving input 01 or 10 is detectable in M∗0.
This fault is, however, undetectable using M∗1. The
erroneous transition from G to B on input 11 also belongs to
this type. The associated cost C1(H) is defined as:
\[
C_1(H) = |\text{pre-image}(E)| \cdot |\text{pre-image}(A)| \cdot 2
       + |\text{pre-image}(G)| \cdot |\text{pre-image}(B)| \cdot 1
\]
The constant coefficients 2 and 1 are the cardinalities
of tr_input(E, B) = {01, 10} and tr_input(G, A) =
{11}. In general, if states s1 and s2 are to be merged
and some inputs trigger the transition from a state s
to s1 or s2, each input corresponds to a next-state-change
fault that will escape detection by the image
monitor. The general formula to compute the cost
C1(H) is given in the first row of Fig. 4.
• Current-state-change faults. State transitions with
current state A or B can also become undetectable if
their destinations are different, forming the second
fault type. For instance, the next-state fault $A \xrightarrow{00} C$ is
detected by M∗0. After mapping A and B to one state,
M∗1 is not able to determine whether the transition is
initiated from A or B, so the next-state
fault $A \xrightarrow{00} C$ becomes undetectable. Erroneous transitions
falling into this type also include $A \xrightarrow{00} F$ and
$B \xrightarrow{00,11} B$. Similarly,
we can construct a cost function C2(H) for this fault
type as follows:
\[
C_2(H) = |\text{pre-image}(A)| \cdot |\text{pre-image}(C)|
       + |\text{pre-image}(A)| \cdot |\text{pre-image}(F)|
       + |\text{pre-image}(B)| \cdot |\text{pre-image}(B)| \cdot 2
\]
The general formula for C2(H) is shown in the second
row of Fig. 4. We can read the formula as follows.
If states s1 and s2 are to be merged and some inputs
trigger the transition from s1 or s2 to a state s, each of
the inputs introduces a current-state-change fault
that will escape detection by the image monitor.
• Current-next-state-exchange faults. The existence of
a state transition from A to B causes undetectable
next-state faults, which change both the current and
next states of the transition. For example, the erroneous
transition $B \xrightarrow{00,11} A$, which has different current and
next states from the valid transition $A \xrightarrow{00,11} B$, becomes
undetectable in M∗1. The cost contributed by this type
of fault is
\[
C_3(H) = |\text{pre-image}(B)| \cdot |\text{pre-image}(A)| \cdot 2
\]
In general, suppose that states s1 and s2 are to be
merged into a single state s, and input i causes a
transition from s1 to s2. Then the faulty transition from s2
to s1 on input i corresponds to a current-next-state-exchange
fault that cannot be detected by the image
monitor. The general formula of C3(H) is shown in
the third row of Fig. 4.
The total cost C(H) of a simple state homomorphism
H that maps states s1 and s2 into one image is the sum
C1(H) + C2(H) + C3(H) of the costs specified above.
The formulas in Fig. 4 indicate the complexity of computing
C(H). Given a state pair, the only computation
is a group of set differences, whose complexity is O(d),
where d is the degree of the monitoring FSM. There are
in total |S|^2 state pairs, resulting in a final complexity
of O(d|S|^2).
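As a worked instance of the three cost terms, note that every state of the base monitor M∗0 is its own pre-image, so each |pre-image(·)| factor equals 1 when H is applied directly to M∗0. Taking the three example expressions for merging A and B at face value (the general formulas of Fig. 4 are not reproduced here, so this is only an illustration of the arithmetic), the cost would evaluate as:

```latex
\begin{aligned}
C_1(H) &= 1 \cdot 1 \cdot 2 + 1 \cdot 1 \cdot 1 = 3,\\
C_2(H) &= 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 \cdot 2 = 4,\\
C_3(H) &= 1 \cdot 1 \cdot 2 = 2,\\
C(H)   &= C_1(H) + C_2(H) + C_3(H) = 9.
\end{aligned}
```

In later iterations the pre-image factors can exceed 1, because a monitor state may then stand for several states of M∗0.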
We turn next to the selection of the homomorphisms
used to construct M∗. In general, a homomorphism
H can introduce non-determinism. For example, M∗1
in Fig. 2(b) is non-deterministic because applying in-
put 00 to state AB triggers transitions to itself and to
C . Such non-determinism is easily eliminated, but it
increases the number of states. Thus we want to select
homomorphisms that have both low cost C and
deterministic images. To aid this heuristic selection
process, we introduce the state dependency graph. A
state dependency graph (SDG) G = (V, E) of an FSM
M∗ = {I, S, δ, O, λ, S0} is a directed graph, where
each node ni is labeled by two states (si,1, si,2) representing
a homomorphism Φi = (φi,I, φi,S, φi,O),
where φi,S(si,1) = φi,S(si,2) = s, and other states, inputs
and outputs are mapped to themselves. There is
an edge (ni, nj) from ni to nj in G if there is an input
t such that δ(t, si,1) = sj,1 and δ(t, si,2) = sj,2, or
else δ(t, si,1) = sj,2 and δ(t, si,2) = sj,1. Intuitively,
the edge (ni, nj) indicates that if we apply homomorphism
Φi, we must also apply homomorphism Φj to
eliminate non-determinism due to Φi. In the FSM in
Fig. 2(a), the next states of states A and B on input 00
are B and C, respectively. Therefore, there is an edge
from node (A, B) to (B, C), as shown in Fig. 5. For a
state pair (si,1, si,2), we have to compare the inputs for
every state pair that si,1 and si,2 may transit to, resulting
in d^2 comparisons in the worst case. The worst-case
complexity of constructing the SDG is hence O(d^2|S|^2)
since we have to make the comparison for every state
pair. The SDG is similar to the implication graph used
for FSM minimization [10]. The major difference is
that in an SDG, we only consider whether two states
have consistent next states, while in an
implication graph, we take both next states and outputs
into consideration.
Fig. 5. Part of the SDG for M∗0 in Fig. 2(a).
Fig. 6. Heuristic algorithm for state homomorphism selection.
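The edge rule of the state dependency graph can be sketched as follows. This is a simplified illustration, not the authors' implementation: it iterates over inputs rather than over labeled edges, it skips self-edges and unspecified transitions (both assumptions), and the toy transition table reuses A→B and B→C on input 00 from the running example together with an assumed transition C→C.

```python
from itertools import combinations

# Toy next-state table: delta[(input, state)] -> next state (C -00-> C is assumed).
states = ["A", "B", "C"]
inputs = ["00"]
delta = {("00", "A"): "B", ("00", "B"): "C", ("00", "C"): "C"}

def build_sdg(states, inputs, delta):
    """Build SDG nodes (unordered state pairs) and directed edges between them.

    There is an edge p -> q if some input drives the two states of p to the
    two distinct states of q, in either order.
    """
    nodes = [frozenset(pair) for pair in combinations(sorted(states), 2)]
    edges = set()
    for p in nodes:
        s1, s2 = sorted(p)
        for t in inputs:
            n1, n2 = delta.get((t, s1)), delta.get((t, s2))
            if n1 is None or n2 is None or n1 == n2:
                continue  # unspecified, or both go to the same state: no dependency
            q = frozenset((n1, n2))
            if q != p:  # assumption: a pair that maps onto itself adds no edge
                edges.add((p, q))
    return nodes, edges

nodes, edges = build_sdg(states, inputs, delta)
```

For this fragment the graph contains the edge from (A, B) to (B, C) mentioned in the text, because merging A and B forces their next states B and C on input 00 to be merged as well.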
A strongly connected component (SCC) in a directed
graph is a set of nodes in which there are paths between
any two nodes. The dotted lines in Fig. 5 indicate some
SCCs in the SDG. SCCs that do not depend on other
SCCs represent sets of mappings that should be applied
together to avoid non-deterministic images. The homo-
morphism cost of an SCC is the sum of the costs of the
simple state homomorphisms represented by the nodes
in the SCC. We apply homomorphisms represented by
nodes in SCCs with the lowest cost during the abstrac-
tion process until no further homomorphisms can be ap-
plied without violating fault coverage constraints. The
algorithm is shown in Fig. 6. The complexity of the
algorithm is dominated by SDG construction, which
occurs in every iteration. Since each iteration reduces
the number of states by at least one, the maximum
number of iterations is |S|. Therefore, the worst-case
complexity of the overall algorithm is O(d^2|S|^3).
After the state homomorphism stage, homomorphisms
that map the inputs to a smaller input space
can be applied. Fig. 7 shows a monitor M∗ obtained after
the state homomorphism stage on base monitor M∗0
in Fig. 2(a) has been completed. Note that this monitor’s
behavior is incompletely specified. For instance, the
transitions of state st2 on inputs 00 and 01 are not defined.
The reason for this is that valid inputs of the
target machine M are usually only a non-trivial subset
of the whole input space. We can think of this as
saying there is a dummy state serving as the next state
of the unspecified transitions, and the output of those
transitions is 1, indicating an error in M.
Fig. 7. Sample monitor M∗.
The input space is naturally partitioned into two
groups by the state transitions initiated from state st1
by inputs 00, 10 and inputs 11, 01. Similarly, state st2
also divides the input space into two groups based on
its transitions. The inputs that can be mapped to the
same image are those always in the same group for all
partitions. For example, inputs 01 and 11 of M∗ are in
the same group because they always trigger transitions
with the same next states and outputs for all current
states. On the other hand, inputs 00 and 01 cannot be
mapped to the same image input because they result in
the same next state when the current state is st0, but re-
sult in different next states when the current state is st1.
Therefore, instead of considering inputs pair by pair,
we form a cross product of all the partitions made by
the states. All inputs in the same block of the cross
product can then be mapped to the same image.
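One way to form this cross product is to give each input a signature, namely the tuple of (next state, output) pairs it produces over all current states, and to group inputs with identical signatures. The sketch below does this for a hypothetical two-state monitor (it is not the monitor of Fig. 7); unspecified transitions show up as `None`, which plays the role of the dummy state.

```python
from collections import defaultdict

# Hypothetical two-state monitor: delta[(input, state)], lam[(input, state)].
states = ["st0", "st1"]
inputs = ["00", "01", "10", "11"]
delta = {
    ("00", "st0"): "st1", ("01", "st0"): "st1",
    ("10", "st0"): "st1", ("11", "st0"): "st1",
    ("00", "st1"): "st0", ("10", "st1"): "st0",
    ("01", "st1"): "st1", ("11", "st1"): "st1",
}
lam = {key: 0 for key in delta}

def input_classes(states, inputs, delta, lam):
    """Group inputs that behave identically in every current state."""
    groups = defaultdict(list)
    for i in inputs:
        # Signature = behavior of input i across all states (None if unspecified).
        signature = tuple((delta.get((i, s)), lam.get((i, s))) for s in states)
        groups[signature].append(i)
    return sorted(sorted(group) for group in groups.values())

classes = input_classes(states, inputs, delta, lam)
```

Here inputs 00 and 01 agree in state st0 but differ in state st1, so they fall into different classes, while 01 and 11 agree everywhere and can share one image input.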
In the above algorithm, state abstraction starts di-
rectly from the base monitor. One can ask whether it is
beneficial to extend the two-stage approach (state ho-
momorphism selection followed by input homomor-
phism selection) to a three-stage one, where we first
apply some input homomorphisms. For example, given
the base monitor M∗0 = {I × O, S, δ∗0 , {0, 1}, λ∗0, S0}
of the target system M = {I, S, δ, O, λ, S0}, we can
first map the input space of M∗0 to a smaller one. We
can use the input homomorphism H : I × O → I × O ′,
where O ′ is obtained by a group parity checking code.
That is, we divide the signals of O into groups, com-
pute an even (or odd) parity check bit for each group,
and form O ′ with these parity check bits. Thus, if 1101
is an output of M , and is viewed as two groups 11 and
01, then parity bits for these two groups are 0 and 1,
respectively. Hence 01 is a parity code for output 1101.
This process is analogous to first generating an FSM
whose outputs are the group parity check bits of M’s
outputs, and then constructing a base monitor M′0 for
the new FSM. The previous two-stage approach can
then be applied to M′0. Note that if we construct O′
by generating one even parity check bit for each bit of
M’s outputs, we obtain M′0 = M∗0. Therefore, the two-stage
approach can be viewed as a special case of the
three-stage one, where an identity input homomorphism
is applied. The area of M′0 is almost certainly
smaller than that of M∗0 if the input homomorphism is
nontrivial, because it has fewer input signals. Starting
with a smaller monitor for state abstraction, we may ob-
tain a smaller final monitor that meets some bounds on
fault coverage. However, the number of available ho-
momorphisms with deterministic images may also de-
crease, leaving the potential benefits of this technique
an open question. Our experiments will show that it is
possible to obtain monitors with high fault coverage
and low area overhead following this approach.
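The group parity mapping from O to O′ described above can be written as a short Python function. The function name and the string encoding of output words are illustrative assumptions; the grouping into fixed-size consecutive groups matches the 1101 example in the text.

```python
def group_parity(output_bits, group_size, even=True):
    """Map an output word to its group parity check bits.

    output_bits: string of '0'/'1' characters, e.g. "1101".
    group_size:  number of consecutive output bits per parity group.
    even:        True for even parity (check bit = XOR of the group), else odd.
    """
    check_bits = []
    for start in range(0, len(output_bits), group_size):
        group = output_bits[start:start + group_size]
        ones = group.count("1")
        parity = ones % 2 if even else (ones + 1) % 2
        check_bits.append(str(parity))
    return "".join(check_bits)

two_groups = group_parity("1101", 2)   # groups 11 and 01
identity = group_parity("1101", 1)     # one even parity bit per output bit
```

With one even parity bit per output bit the mapping is the identity, which corresponds to the remark that M′0 = M∗0 in that case.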
4. Area/Fault Coverage Trade-Offs
The monitor design approaches proposed above trade
off fault coverage for area overhead. The area reduc-
tion primarily results from the reduction of the number
of states in the monitor. We now investigate how the
number of states in monitor M∗ affects its area over-
head and fault coverage. In Section 5, we will verify
this analysis experimentally.
As shown in Fig. 3(b), M∗ consists of the interface
logic IL and the compressed finite-state machine Mc.
Let n and l be the number of input and output bits
of IL, respectively, and S the number of states in Mc.
We use L = 2^l to denote the number of outputs. As
discussed in Section 3, the transitions initiated from one
state divide the input space into disjoint sets. Since a
state may change to any other state, the state transitions
initiated from a state divide the input space into (S +
1)/2 sets. The final partition of the input space has
L = ((S + 1)/2)^S sets in the worst case. In fact, the
compressed finite-state machine Mc typically has only
a few states and inputs. Therefore, the interface logic
takes the major portion of the monitor area.
Let F be an arbitrary multiple-output Boolean function
representing the interface logic, and having n input
bits and l output bits. The area of IL is approximated by
the number of literals in the following equations [5].
\[
\begin{aligned}
F   &= a_1 p_1 + a_2 p_2 + \cdots + a_t p_t && (1)\\
p_1 &= \tilde{x}_{11}\,\tilde{x}_{12} \cdots \tilde{x}_{1u_1} && (2)\\
p_2 &= \tilde{x}_{21}\,\tilde{x}_{22} \cdots \tilde{x}_{2u_2} && (3)\\
    &\;\;\vdots\\
p_t &= \tilde{x}_{t1}\,\tilde{x}_{t2} \cdots \tilde{x}_{tu_t} && (t+1)
\end{aligned}
\]
Here, x̃i is a literal representing an input variable or
its complement, pi is a prime implicant of the multiple-output
function F, and ai ∈ A = {i | 0 ≤ i < L = 2^l}.
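The probability that x̃i1 x̃i2 . . . x̃i(n−k) forms a prime
implicant is
\[
L^{-2^k}\left(1 - L^{-2^k}\right)^{n-k},
\]
which corresponds to a k-cube being assigned a fixed value
while none of its n − k neighboring k-cubes is. There are
altogether $\binom{n}{k} 2^{n-k}$ possible
k-cubes and L possible assignments to a k-cube. Consequently,
the average number of k-cube prime implicants is
\[
\binom{n}{k}\, 2^{n-k}\, L^{-2^k}\left(1 - L^{-2^k}\right)^{n-k}(L - 1).
\]
The above expression contains L − 1 rather than L so
that the prime implicants with ai = 0 can be removed.
Summing over k, the average number of prime implicants is
\[
\sum_{k=0}^{n} \binom{n}{k}\, 2^{n-k}\, L^{-2^k}\left(1 - L^{-2^k}\right)^{n-k}(L - 1).
\]
Since a k-cube has n − k literals, the average number
of literals in these prime implicants is
\[
\sum_{k=0}^{n} \binom{n}{k}\, 2^{n-k}\, L^{-2^k}\left(1 - L^{-2^k}\right)^{n-k}(L - 1)(n - k).
\]
In Eq. (1), pi occurs multiple times if the number of
1’s in ai is more than one. Since the expected number
of 1’s in ai is l/2, the expected total number of literals
for F is
\[
K(n, L) = \sum_{k=0}^{n} \binom{n}{k}\, 2^{n-k}\, L^{-2^k}\left(1 - L^{-2^k}\right)^{n-k}
          \times (L - 1)(n - k + l - 1).
\]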
Fig. 8. The expected number of literals K(n, L) of IL as a function
of its number of outputs L when IL has n = 8, 10 and 12 inputs.
Fig. 9. Experimental results for some MCNC benchmark circuits.
The value of K(n, L) for several (n, L) combinations is
plotted in Fig. 8. The expected
number of literals K(n, L) increases significantly with
the number of outputs L. Furthermore, the number of
outputs increases exponentially with the number of
states in Mc. Therefore, the expected number of literals,
or equivalently, the monitor area decreases significantly
as the number of states in Mc decreases.
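For readers who want to experiment with the literal-count estimate, the following Python sketch evaluates K(n, L) as a sum over cube sizes k of C(n, k) · 2^(n−k) · L^(−2^k) · (1 − L^(−2^k))^(n−k) · (L − 1) · (n − k + l − 1), with l = log2 L. The summation range k = 0..n and the exact form of the last factor are assumptions read off from the surviving text of the derivation, so the numbers it produces should be treated as indicative only.

```python
from math import comb, log2

def expected_literals(n, L):
    """Evaluate the assumed closed form of K(n, L).

    n: number of input bits of the interface logic.
    L: number of distinct outputs (L = 2**l for l output bits).
    """
    l = log2(L)
    total = 0.0
    for k in range(n + 1):
        # Probability that all 2**k minterms of a k-cube take one given value.
        q = float(L) ** (-(2 ** k))
        cubes = comb(n, k) * 2 ** (n - k)          # number of k-cubes
        prime_prob = q * (1.0 - q) ** (n - k)      # cube is a prime implicant
        total += cubes * prime_prob * (L - 1) * (n - k + l - 1)
    return total

smallest_case = expected_literals(1, 2)
```

For n = 1 and L = 2 only the k = 0 term contributes, giving 2 · 0.5 · 0.5 · 1 · 1 = 0.5, which is a convenient sanity check on the implementation.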
We now consider the monitor’s fault coverage. Usu-
ally, the behavior of a system is not completely spec-
ified, leaving more optimization opportunities in the
design process. The don’t cares in the specification of
the target FSM M affect the complexity of both M and
its monitor. Therefore, it may be beneficial if we con-
sider the overall area of M and its monitor when the
don’t cares are exploited. This “design-for-monitoring”
problem is beyond the scope of the present paper, how-
ever. Since our monitors are designed to detect faults in
the final FSMs whose don’t cares have been eliminated
during the design process, we focus on completely
specified FSMs. For a completely specified FSM with
u input bits, V states and w output bits, the total number
of single next-state faults is 2^u × V × (V − 1) since under
the SBF model, any incorrect next state for a given
current state and input is a fault. Similarly, the number
of single output faults is 2^u × V × (2^w − 1). Therefore,
the total number of SBFs is 2^u × V × (2^w + V − 2).
When the monitor has a single state, we cannot distinguish
the faulty next state from the correct one. In addition,
if an input triggers different outputs for different
current states, we cannot identify a faulty output if the
output is valid for some current states. Therefore, the
total number of undetectable faults is 2^u × V × (2V − 2),
given that 2^w ≥ V, in the worst case. Consequently, the
fault coverage is
\[
1 - \frac{2^u \times V \times (2V - 2)}{2^u \times V \times (V + 2^w - 2)}
  = \frac{2^w - V}{V + 2^w - 2}
  \approx \frac{1 - V/2^w}{1 + V/2^w}
\]
If a target FSM has a high output/state ratio 2^w/V,
which means that V/2^w is approximately 0, we can usually
obtain single-state monitors with high fault coverage.
For example, the MCNC benchmark circuit s1488
in Fig. 9 has 48 states and 19 output bits. On the other
hand, single-state monitors typically take much less
area than their multiple-state counterparts. Therefore,
for FSMs with high output/state ratios, we can often
reduce the area significantly by reducing the number
of monitor states to one with only minor fault coverage
loss.
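The fault counts and the worst-case coverage bound for single-state monitors are easy to evaluate numerically. The sketch below simply codes the expressions derived in this section; the function names are illustrative. The number of input bits u cancels in the coverage ratio, so only V and w are needed there.

```python
def total_sbfs(u, V, w):
    """Total number of single behavior faults: 2^u * V * (2^w + V - 2)."""
    return 2 ** u * V * (2 ** w + V - 2)

def undetectable_single_state(u, V):
    """Worst-case undetectable SBFs for a one-state monitor: 2^u * V * (2V - 2)."""
    return 2 ** u * V * (2 * V - 2)

def single_state_coverage(V, w):
    """Worst-case fault coverage (2^w - V) / (V + 2^w - 2), valid for 2^w >= V."""
    return (2 ** w - V) / (V + 2 ** w - 2)

# A machine with 48 states and 19 output bits, as quoted in the text.
coverage_48_19 = single_state_coverage(48, 19)
```

For 48 states and 19 output bits the ratio V/2^w is about 10^-4, so the worst-case bound stays above 99.9% under the SBF model.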
5. Experimental Results
A set of experiments was performed on the MCNC
FSM benchmark circuits [2] to evaluate our monitor
design method and the foregoing analysis. Fig. 9 shows
the results on several representative benchmarks. The
FSMs to be monitored and their properties (number of
input bits, number of output bits, number of states and
area) are shown in columns 2 to 5, followed by the area
overhead of simple duplication. The next four columns
present the number of output check bits, inputs, states,
and area of the corresponding monitors, respectively.
The area overhead and fault coverage are shown in the
last two columns, which are our main metrics of monitor
effectiveness. In the duplication case, we assume
that the outputs from the two copies of the target FSMs
are compared with dual-rail checker trees [19]. The
FSMs to be monitored, their monitors, and the dual-rail
checkers were all synthesized using the Synopsys
synthesis tools [17], which also reported the areas of
the FSMs and their monitors. Area is shown in Fig. 9 as
a multiple of the area of a standard two-input NAND.
Fig. 10. Program run times for some larger FSMs.
The results demonstrate that monitors with a few
states and high fault coverage can be obtained. For ex-
ample, the first monitor for ex1 takes 63% of the area
of ex1, while its fault coverage is above 96%. However,
monitors with only one state, i.e., combinational
monitoring circuits, almost always have
smaller area overhead. For example, when
the number of states of the monitors for ex1, ex6 and
tma increases from 1 to 2, the area overheads are al-
most tripled. This trend holds for all other examples,
too. On the other hand, the fault coverage of the one-
state monitors for ex1 and s1488 is still around 99%.
Even for ex6 and tma, their one-state monitors have
fault coverage around 90%. However, the fault cover-
age of circuit tbk’s combinational monitor is as low as
15.6%. As analyzed in Section 4, the fault coverage of
the combinational monitors depends on the output/state
ratios of the target machines. The ratios for all circuits
except tbk are high, leading to one-state monitors with
high fault coverage, which confirms our earlier anal-
ysis. The results also show that the overhead of most
of the online monitors is smaller than that of duplica-
tion. In particular, the overhead of the combinational
monitors is much smaller than simple duplication.
We also applied our method to some larger ITC’99
benchmark circuits [7] to investigate the scalability of
the proposed approach. We examined the benchmarks
from b01 to b10, among which b01, b02, and b06 are ex-
cluded because they have fewer than 10 registers. Since
these circuits are given in blif format, we extracted their
FSM representations using the “stg extract -e -c” rou-
tine of SIS [16]. Circuits b08 and b09 have tens of
thousands of states, and require too much memory for
our current implementation. SIS also failed to generate
an FSM for b04 because of its huge number of states.
For the remaining circuits, we ran our program on a
SUN Blade-1000 with 512 MB of RAM and the run
times needed to obtain the combinational monitors are
reported in Fig. 10. We can see that the run times for
FSMs with tens of states are less than a second. Even
for FSMs with more than 2,000 states, the run times are
less than two hours. These data give us confidence that
processing even larger FSMs like b08 or b09 is feasible
given a more memory-efficient implementation of our
method.
To investigate the monitors’ fault detection
capability under the standard single stuck-at fault
(SAF) model, we did another set of experiments. Since
the monitor has only one output signal, we can de-
tect an SAF inside the monitor if it forces the output
to 1; otherwise, the monitor will not report the error.
This problem can easily be solved by using dual-rail
logic, for instance, differential cascode voltage switch
(DCVS) [6] logic. DCVS has an inherent self-testing
property which can provide good coverage for stuck-at
faults [13]. Although it is also claimed to have advan-
tages over traditional static CMOS in terms of layout
area, its application is limited to small modules due to
transistor sizing and crossbar current problems. There-
fore, it is more appropriate to use it in the monitor rather
than the whole circuit. In our experiments, we only in-
ject SAFs into the target system and keep the monitor
fault-free. Because the monitors are designed to detect
transient faults, we assume that a fault will disappear
after a certain time. Since the duration of a transient
fault is usually only a fraction of a cycle [1], we limit
the fault duration to one cycle.
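The one-cycle transient SAF assumption can be illustrated with a minimal sketch. The toy circuit below (a toggle flip-flop with nets `a`, `t`, and `z`) and the helper names `step`, `run`, and `first_output_mismatch` are hypothetical and chosen only for illustration; they are not the benchmark circuits or the simulator used in our experiments.

```python
from typing import List, Optional, Tuple

Fault = Tuple[str, int]  # (net name, stuck-at value); hypothetical encoding


def step(state: int, x: int, fault: Optional[Fault] = None) -> Tuple[int, int]:
    """Evaluate one clock cycle of a toy circuit and return (next_state, output).

    Nets: a = x, t = state XOR a (next state), z = state AND a (output).
    If a fault is given, the named net is forced to the stuck-at value.
    """
    def force(net: str, value: int) -> int:
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a = force("a", x)
    t = force("t", state ^ a)
    z = force("z", state & a)
    return t, z


def run(inputs: List[int], fault: Optional[Fault] = None,
        fault_cycle: int = 0, state: int = 0) -> List[Tuple[int, int]]:
    """Simulate the circuit; the fault is active during exactly one cycle."""
    trace = []
    for cycle, x in enumerate(inputs):
        active = fault if cycle == fault_cycle else None
        state, z = step(state, x, active)
        trace.append((state, z))
    return trace


def first_output_mismatch(good, bad) -> Optional[int]:
    """Return the first cycle whose output differs, or None if none does."""
    for cycle, ((_, zg), (_, zb)) in enumerate(zip(good, bad)):
        if zg != zb:
            return cycle
    return None


inputs = [1, 0, 0, 1]
good = run(inputs)
bad = run(inputs, fault=("t", 0), fault_cycle=0)
print(first_output_mismatch(good, bad))
```

In this toy run the fault itself lasts one cycle, but it corrupts the stored state, so the first wrong output appears only three cycles later. This is the kind of delayed effect that motivates measuring coverage as a function of detection latency.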
In this set of experiments, we studied the fault cover-
age of the monitors as a function of the fault detection
latency. Intuitively, we regard those SAFs that actually
lead to wrong outputs or might lead to wrong outputs
later as detectable faults. Therefore, we treat an SAF as
detectable if it satisfies one of the following conditions:
Fig. 11. SAF fault coverage as a function of fault detection latency (1, 2, 4, 8, 16, and 32 cycles) for the monitors of (a) ex1, (b) ex6 and mc, and (c) tbk, tma,
and s1488.
(1) it produces a wrong output within a fixed time after
the fault occurrence, which is set to 32 cycles, or (2) it
leads to a wrong state in the 32nd cycle but does not
cause a wrong output. We enumerated all the SAFs in
the circuits and injected every fault 100 times. After a
fault was injected into the circuit, we applied random
input patterns until the fault was detected, the circuit
returned to generating both correct outputs and correct
next states, or 32 cycles elapsed. We collected the number
of faults detected by the monitor at a given latency
ranging from one to 32 cycles and divided it by the
total number of detectable faults; this is defined as the
fault coverage of the monitor under the given latency
assumptions.
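The coverage measure just defined can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program used in our experiments: each injection is treated as one trial, a fault counts toward latency L if the monitor flags it within L cycles (a cumulative reading), and the names `Trial` and `coverage_by_latency` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple

# (is_detectable, cycle at which the monitor flagged the fault, or None)
Trial = Tuple[bool, Optional[int]]


def coverage_by_latency(trials: Iterable[Trial],
                        latencies: Sequence[int] = (1, 2, 4, 8, 16, 32)
                        ) -> Dict[int, float]:
    """Fraction of detectable faults flagged by the monitor within each latency."""
    # Only detectable faults enter the denominator.
    detectable = [lat for ok, lat in trials if ok]
    if not detectable:
        return {L: 0.0 for L in latencies}
    total = len(detectable)
    return {
        L: sum(1 for lat in detectable if lat is not None and lat <= L) / total
        for L in latencies
    }


# Made-up trial data, for illustration only.
trials = [(True, 1), (True, 1), (True, 5), (True, None),
          (False, None), (False, 2)]
coverage = coverage_by_latency(trials)
print(coverage)
```

With this made-up data, four trials are detectable; two are flagged in the first cycle and a third by cycle 5, so the coverage is 0.5 up to a latency of 4 cycles and 0.75 from 8 cycles onward.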
The SAF fault coverage as a function of the fault de-
tection latency for all the monitors is plotted in Fig. 11.
This figure shows that the monitors detect most of the
faults in the first cycle after fault injection, and far fewer
in the following cycles. Furthermore, the fault cover-
age increases gradually in the first eight cycles and
flattens thereafter for almost all the monitors. Taking
the monitor mc-5out1state (see Note 1) in Fig. 11(b), for example,
we can see that 55% of the SAFs are detected in the
first cycle, and another 10% are detected in the following
seven cycles. Only a few percent of the SAFs are
detected after the eighth cycle. The fault coverage here
shows trends similar to the fault coverage under the
SBF model. Nevertheless, the fault coverage of SAFs
varies more than that of SBFs. The fault coverage of
the combinational monitors after eight cycles is 60 to
80%. Note, however, that detectable SAFs form only a small part
(10 to 35%) of all injected SAFs.
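As a worked reading of the mc-5out1state figures quoted above, let C(k) denote the cumulative fault coverage within k cycles. This notation is introduced here only for illustration, and the values are the approximate percentages read from Fig. 11(b):

```latex
\begin{align*}
C(1)  &\approx 0.55, \\
C(8)  &\approx C(1) + 0.10 = 0.65, \\
C(32) &= C(8) + \varepsilon, \qquad \varepsilon \approx \text{a few percent}.
\end{align*}
```

Nearly all of the coverage achieved within 32 cycles is therefore already obtained by the eighth cycle.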
6. Conclusion
We have presented a systematic method to design on-
line monitors for finite-state machines. We first define
the single behavior fault (SBF) model and construct a
base monitor that detects all SBFs. A heuristic method
based on homomorphisms is then systematically ap-
plied to reduce the monitor area. The state dependency
graph is introduced to guide homomorphism selection.
We analyzed the effects of the number of states on the
monitor’s area overhead and fault coverage. Our results
show that monitors with a few states often have both
low area overhead and high fault coverage. In FSMs
with high output/state ratios, a decrease in the number
of monitor states can significantly reduce the area over-
head, while incurring marginal fault coverage loss. In
fact, it is often beneficial to compress the number of
states in the monitor to one for these FSMs. Our exper-
imental results are consistent with the analysis. Simu-
lations under the standard single stuck-at fault (SAF)
model show similar trends. Furthermore, the simula-
tions also show that most transient faults are detected
quickly—usually within a few clock cycles.
Acknowledgment
This research was supported by the National Science
Foundation under grant no. CCR-0073406.
Note
1. This is the version of mc with 5 output check bits and a 1-state
monitor as defined in Fig. 9.
References
1. H. Cha and J.H. Patel, “A Logic-Level Model for Alpha-Particle
Hits in CMOS Circuits,” in Proc. Int. Conf. Computer Design,
1993, pp. 538–542.
2. Collaborative Benchmarking Lab., North Carolina State Univ.,
http://www.cbl.ncsu.edu/benchmarks.
3. K. De, C. Natarajan, D. Nair, and P. Banerjee, “RSYN: A System
for Automated Synthesis of Reliable Multilevel Circuits,” IEEE
Trans. VLSI, vol. 2, no. 2, pp. 186–195, 1994.
4. D. Gizopoulos, A. Paschalis, and Y. Zorian, “An Effective BIST
Scheme for Data Paths,” in Proc. International Test Conf., 1996,
pp. 76–85.
5. G.D. Hachtel and F. Somenzi, Logic Synthesis and Verification
Algorithms, Boston, MA: Kluwer Academic Publishers, 1996.
6. L.G. Heller, W.R. Griffin, J.W. Davis, and N.G. Thoma,
“Cascode Voltage Switch Logic: A Differential CMOS Logic
Family,” Dig. Tech. Papers, ISSCC, pp. 16–17, 1984.
7. ITC’99 Benchmark Documentation. http://www.cerc.utexas.edu/itc99-benchmarks/bendoc1.html.
8. R. Karri and N. Mukherjee, “Versatile BIST: An Integrated Ap-
proach to On-Line/Off-Line BIST,” in Proc. International Test
Conf., 1998, pp. 910–917.
9. H.B. Kim, D.S. Ha, T. Takahashi, and T.J. Yamaguchi, “A New
Approach to Built-In Self-Testable Datapath Synthesis Based
on Integer Linear Programming,” IEEE Trans. VLSI, vol. 8,
pp. 594–605, Oct. 2000.
10. Z. Kohavi, Switching and Finite Automata Theory, 2nd ed., New
York: McGraw-Hill, 1978.
11. H.R. Lewis and C.H. Papadimitriou, Elements of the Theory of
Computation, Englewood Cliffs, NJ: Prentice-Hall, 1981.
12. A. Mahmood and E.J. McCluskey, “Concurrent Error Detection
Using Watchdog Processors—A Survey,” IEEE Trans. Comput-
ers, vol. 37, no. 2, pp. 160–173, 1988.
13. R.K. Montoye, “Testing Scheme for Differential Cascode Volt-
age Switch Circuits,” IBM Tech. Disc. Bull., vol. 27, pp. 6148–
6152, 1985.
14. A. Morozov, V.V. Saposhnikov, V.V. Saposhnikov, and M.
Gossel, “New Self-Checking Circuits by Use of Berger Codes,”
in Proc. 6th IEEE On-Line Testing Workshop, 2000, pp. 141–
146.
15. S. Ravi, G. Lakshminarayana, and N.K. Jha, “Tao-BIST: A
Framework for Testability Analysis and Optimization for Built-
In Self-Test of RTL Circuits,” IEEE Trans. CAD, vol. 19,
pp. 894–906, 2000.
16. E. Sentovich, K. Singh, C. Moon, H. Savoj, R. Brayton, and
A. Sangiovanni-Vincentelli, “Sequential Circuit Design Using
Synthesis and Optimization,” in Proceedings of ICCAD, 1992,
pp. 328–333.
17. Synopsys, Synopsys Design Analyzer datasheet, http://www.synopsys.com/products/logic/design_compiler.html.
18. N.A. Touba and E.J. McCluskey, “Logic Synthesis of Multilevel
Circuits with Concurrent Error Detection,” IEEE Trans. CAD,
vol. 16, no. 7, pp. 783–789, 1997.
19. J. Wakerly, Error Detecting Codes, Self-Checking Circuits and
Applications. Amsterdam: North-Holland, 1982.
20. C. Zeng, N. Saxena, and E.J. McCluskey, “Finite State Machine
Synthesis with Concurrent Error Detection,” in Proc. Interna-
tional Test Conf., 1999, pp. 672–679.
Feng Gao received his B.S. degree in mathematics at Shandong
University in 1997 and his M.E. degree at the Institute of Com-
puting Technology, Chinese Academy of Science, Beijing in 2000.
He is currently a Ph.D. candidate in the Department of Electrical
Engineering and Computer Science at the University of Michigan,
Ann Arbor. His research interests are in the area of low power circuit
design, power grid optimization, testing, and DFT.
John P. Hayes is the Claude E. Shannon Professor of Engineer-
ing Science at the University of Michigan, Ann Arbor, where he
teaches and does research in the areas of computer-aided design,
verification and testing; VLSI circuits; embedded system architec-
ture; and quantum computing. He received the B.E. degree from
the National University of Ireland, Dublin, and the M.S. and Ph.D.
degrees from the University of Illinois, Urbana-Champaign. At Illi-
nois he participated in the design of the ILLIAC III computer. In
1970 he joined the Shell Benelux Computing Center in The Hague,
where he worked on mathematical programming. From 1972 to 1982
Dr. Hayes was a faculty member of the Departments of Electrical
Engineering-Systems and Computer Science of the University of
Southern California. He joined the University of Michigan in 1982
and was the founding director of Michigan’s Advanced Computer
Architecture Laboratory (ACAL). Dr. Hayes is the author of nu-
merous technical papers, several patents, and five books, includ-
ing Hierarchical Modeling for VLSI Circuit Testing, (Kluwer, 1990,
coauthored with D. Bhattacharya) and Computer Architecture and
Organization, (3rd edition, McGraw-Hill, 1998). He has served as
editor of various technical journals, including the IEEE Transactions
on Parallel and Distributed Systems and the Journal of Electronic
Testing. Dr. Hayes is a Fellow of IEEE and ACM, and a member of
Sigma Xi.
