Abstract
Introduction
The control part of a digital system is usually the most critical part from the testability point of view. The irregularity and complexity of the control structure on the one hand, and its central role in the functioning of the whole digital system on the other hand, cause problems both in the synthesis of self-checking controllers and in the analysis of their efficiency. In this paper, we deal with the analysis of efficiency; namely, we investigate the latency characteristics of Finite State Machine (FSM) based controllers that are checked on-line.
In [Shedletsky 76], a method is proposed for computing the length of the testing sequence required to detect faults in an off-line tested sequential circuit. The method consists of constructing a specific Markov process with (R + 1) states, where R is the number of states of the FSM. The additional (R + 1)-th state is an absorbing state, and the matrix of transition probabilities of the process is constructed in such a way that the process moves to the additional state if a fault is manifested under the testing sequence. The method described in [Shedletsky 76] can be adapted to on-line testing. We propose a new method, which appears to be advantageous in terms of computational complexity. The obtained results allow the latency to be treated fully as a random value.
Since the method still requires accurate calculations and significant information about the FSM and the conditions of its functioning, we propose, especially at the initial stage of the design, to estimate only the average latency using limited preliminary information about the structure of the FSM. On this basis, we estimate a bound on the average latency using only the number of states of the FSM and the maximal length of a product term.
This paper is organized as follows. Section 2 describes the basic definitions and assumptions. The latency distribution functions for faults in the FSM and in the checker are presented in Section 3. Section 4 considers the upper bound of the average fault latency. Experimental results are presented in Section 5. The paper concludes with Section 6.
Definitions and Assumptions
Let us describe a finite state machine (FSM) 
, Ω } is an ordinary probability space with the set of elementary events I and σ-algebra Ω with the probability measure p [Feller 71]. We thus postulate a probabilistic model of random action on the FSM. The probabilistic behavior of an FSM can be analyzed by regarding its transition structure as a Markov chain [Kemeny 67]: it is sufficient to label each outgoing edge of every state with the probability that the FSM makes that particular transition in order to obtain a finite-state model that matches the definition of a discrete-parameter Markov chain.
A discrete-parameter Markov chain
is a stochastic process such that the number of possible states is finite, and the parameter space T is discrete. The Markov property says that the random variable representing the future behavior of the system does not depend on states reached in the past but only on the present state.
In this paper we consider a homogeneous Markov chain. In this case, the Markov chain has stationary transition probabilities and is defined by a transition probability matrix $(p_{ms})$, where $p_{ms}$ is the probability of transition from state m to state s.
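As a minimal illustration of this definition, the sketch below evolves a vector of state probabilities under a fixed transition matrix. The three-state matrix `P` and the helper names `step` and `distribution_after` are hypothetical, chosen only for illustration; they are not taken from the FSM studied in this paper.

```python
def step(pi, P):
    """One step of a homogeneous chain: new_pi[s] = sum over m of pi[m] * P[m][s]."""
    n = len(P)
    return [sum(pi[m] * P[m][s] for m in range(n)) for s in range(n)]


def distribution_after(pi0, P, t):
    """State probabilities after t steps, starting from the distribution pi0."""
    pi = list(pi0)
    for _ in range(t):
        pi = step(pi, P)
    return pi


# Hypothetical 3-state chain (an assumption); every row sums to one.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]
pi0 = [1.0, 0.0, 0.0]          # the chain starts in state 0
pi2 = distribution_after(pi0, P, 2)
print(pi2)
```

Because the chain is homogeneous, the same matrix `P` is applied at every step, so the distribution after t steps depends only on the initial distribution and on t.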
The fault model used in this paper is a general model of single stuck-at faults.
As commonly accepted [Lala 00], the manifestation time of a fault is the time between the moment when the fault occurs and the moment when the fault manifests itself. Unlike the commonly used definition, we consider a manifestation to be any violation or distortion of the correct operation of the FSM, not necessarily followed by errors at the output. This convention makes it possible to distinguish between potential and real fault latencies (Figure 1).
Figure 1. Fault latencies in a FSM
We say that the manifestation time of a fault, taken in the above-mentioned sense, is the potential fault latency. The potential latency is a feature of the FSM as such, without any checker. The real fault latency is a feature of the FSM combined with a checker.
Note that one and the same checker may or may not reach the potential fault latency. Moreover, the self-checking FSM (combined with a checker) may achieve a low latency (down to the potential one) for one class of faults and quite a high latency for another class of faults. Below, we provide an example demonstrating the difference between the potential and the real fault latencies.
According to the definition of the self-testing property, which is a necessary requirement of the totally self-checking (TSC) property, for each fault there is an input vector that occurs during normal operation and produces a non-code output vector [Nicolaidis 98]. In light of this definition, note that manifestation of a fault in the above sense does not necessarily lead to the appearance of a non-code output vector. Consequently, the FSM does not possess the TSC property for such faults. At the same time, these faults can be detected by using a novel architecture of Algorithmic State Machine (ASM) based self-checking controllers [Levin 99].
In this paper, we assume that all faults are single, which means that the probability of a second fault occurring during the latency of the first fault is negligibly small. We assume that the FSM has random inputs (is under random action); the latency of each fault is a random value characterized by its distribution function. Our purpose is to determine the latency distribution function.
Latency distribution functions
The main idea of the proposed method is to divide the whole set of possible trajectories of the random process describing the behavior of the FSM into two subsets. The first subset contains no trajectories that manifest a particular fault; we call it the non-manifesting subset of trajectories. The remaining (second) subset of trajectories includes the fault-manifesting states. Then the probability of the fault manifestation at the t-th step is equal to the probability that the process moves along the trajectories 
Latency distribution function for faults of input variables
Let the probabilities of the random variables $x_l$ be:
Then the behavior of the fault-free FSM is described by a Markov chain with the following transition probability matrix: will be:
Note that matrices (2)-(4) do not describe all possible transitions, but only the fault-free ones. Thus, the matrices are not stochastic, i.e. the sums of the elements in some rows are not equal to one. Let the following vector:
be the vector of state probabilities of the FSM at the (t-1)-th step after the fault has occurred but has not yet been manifested. Let us introduce a column vector:
with ones placed at the positions of the fault. Here T denotes matrix transposition. This vector gives, for each state, the probability that the fault will be manifested during one step. The probability of manifestation of the fault $x_1/1$ at the t-th step (that is, the distribution function $P_f(t)$ of the latency) can be expressed as follows:
where vector ( )
Figure 2 presents the results of computing the probability that the latency is greater than t, i.e. $F(t) = \Pr(\text{latency} > t)$, with the use of (7). To demonstrate the difference between the potential and the real latencies, we consider the fault $x_4^1$. Its character is seen in transitions 7 and 8 of Table 1. These transitions initiate one and the same microinstruction. For this fault, if input $x_1 = 0$ and input $x_4 = 0$, both corresponding product terms will be equal to one. We consider this fact a fault manifestation. Thus, though the fault is manifested, the manifestation is masked at the output, and consequently there are no errors at the output of the FSM. An example of the distribution functions of the potential and real latencies, computed using the proposed method for the fault $x_4^1$, is presented in Figure 3. The mentioned fault can be detected, for example, in the architecture from [Levin 99]. In this case, the real latency is actually equal to the potential latency (curve 1).
As has been shown, manifestation of a fault may not lead to the occurrence of errors at the output. Consequently, the FSM does not possess the TSC property for such faults. The self-checking architecture from [Levin 99] enables detection of faults that may not lead to the appearance of non-code outputs, although such faults can be detected
Fault latency distribution function for output variables
Let the fault $z_1/1$ occur. The fault will not be manifested if microinstructions containing the variable $z_1$ are initiated (see states $q_2$, $q_3$ and $q_5$ of the exemplary FSM). Using the method described above, we separate from the whole set of trajectories the subset that does not allow the fault $z_1/1$ to manifest. As follows from Table 1, the matrix corresponding to this subset will be:
The distribution function of the latency, that is, the probability of the fault manifestation at the t-th step, is
where ( )
is the vector of the FSM states at the moment the fault occurs.
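The separation of trajectories used in this section can be sketched numerically as follows. The sketch assumes that the non-manifesting transitions are collected in a sub-stochastic matrix `A`, so that the probability mass missing from each row is the one-step manifestation probability from that state. The two-state matrix, the initial vector and the function names are hypothetical and are not derived from Table 1.

```python
def vec_mat(v, A):
    """Row vector times matrix: result[s] = sum over m of v[m] * A[m][s]."""
    return [sum(v[m] * A[m][s] for m in range(len(A))) for s in range(len(A[0]))]


def manifestation_pmf(pi0, A, t_max):
    """Pr(fault is first manifested at step t) for t = 1..t_max."""
    # Mass missing from each row = probability of manifesting in one step.
    d = [1.0 - sum(row) for row in A]
    v = list(pi0)                      # mass still on non-manifesting trajectories
    pmf = []
    for _ in range(t_max):
        pmf.append(sum(v[m] * d[m] for m in range(len(d))))
        v = vec_mat(v, A)
    return pmf


def survival(pi0, A, t):
    """Pr(latency > t): mass remaining on non-manifesting trajectories after t steps."""
    v = list(pi0)
    for _ in range(t):
        v = vec_mat(v, A)
    return sum(v)


# Hypothetical sub-stochastic matrix (an assumption): rows sum to 0.75 and 0.5.
A = [
    [0.5, 0.25],
    [0.0, 0.5],
]
pi0 = [1.0, 0.0]                       # state probabilities when the fault occurs
pmf = manifestation_pmf(pi0, A, 3)
print(pmf, survival(pi0, A, 3))
```

In this sketch the probabilities of manifesting at steps 1 to t and the probability of surviving beyond step t always sum to one, which is a convenient sanity check on any matrix built this way.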
Latency distribution function of the FSM memory
If states of the memory are coded by a non-redundant code, faults are not detectable, but if any redundant code is used, the memory latency depends on specific characteristics of the code. As an example, we will examine the "one-hot" code. 
. (12)
Then the matrix that allows separating the non-manifesting set of trajectories has zeros in the r-th row and the r-th column:
. (13)
In our example, matrix (12) takes the form (1). If
is the vector of initial state probabilities (at the moment the fault occurs), the latency distribution function, being the probability that the fault will not be manifested by the k-th step, is:
p , is the element of matrix (12).
Latency distribution function of the checker
To obtain the distribution function of the checker, we use the above-described method of constructing two matrices . (17) Now, to obtain the distribution function we apply formula (11) with ( )
Latency of a group of faults and average latency
Let now { } 
Upper bound of the average fault latency
The methods of latency computation described above allow the latency to be described completely as a random value. These methods require considerable calculations and significant preliminary information about the structure and parameters of the FSM. However, especially at the initial stage of design, it is possible to estimate only the average latency on the basis of quite limited information about the FSM.
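For reference, the average of a latency T that takes the values 1, 2, ... equals the sum of Pr(T > t) over all t ≥ 0. The sketch below applies this identity with a truncated sum. The constant per-step manifestation probability `p` (a geometric latency) and the function name are assumptions made purely for illustration; they are not the model or the bound derived in this paper.

```python
def mean_latency_from_survival(survival, t_max):
    """Approximate E[T] as the sum of Pr(T > t) for t = 0..t_max-1."""
    return sum(survival(t) for t in range(t_max))


# Assumption: the fault manifests at each step with the same probability p,
# so that Pr(T > t) = (1 - p) ** t and the exact mean is 1 / p.
p = 0.25


def geometric_survival(t):
    return (1.0 - p) ** t


approx_mean = mean_latency_from_survival(geometric_survival, 200)
print(approx_mean)
```

The truncation point `t_max` only needs to be large enough for the remaining tail of Pr(T > t) to be negligible.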
We will estimate the average fault latencies of the FSM by constructing the "worst" case of the FSM. We also assume that: $\Pr(x_l = 1) = p_l = 0.5$,
This allows obtaining an unimprovable upper bound for the structure of the FSM as defined.
We say that the set of product terms is of the triangular form of length K ($K \le N_x$) if
The following two theorems are formulated without proofs.
Theorem 1. For an FSM with the triangular set of product terms of length K as transition functions and equiprobable single faults, the average probability of the fault manifestation (over the single faults of all variables) is minimal and equals
Theorem 2. Among all FSMs with $N_Q$ states and $N_x$ input variables, the automaton shown in Figure 4 has the maximal value of the average latency for single faults. This value is equal to:
( ) 
Experimental results
Several FSM controllers were used as benchmarks in the research. Each FSM describes the functioning of a special-purpose microprocessor. Table 2 lists the parameters of the FSM benchmarks and the results of: 
Conclusion
The paper introduces the concepts of potential and real latencies and proposes a methodology for computing them for FSMs that are checked on-line. The concept of the potential latency makes it possible to estimate a theoretical lower bound of the real latency for self-checking systems.
Exact expressions for the statistical characteristics of the latencies are obtained. The upper bound of the average latency and the corresponding "worst" case FSM are presented. The proposed approach can be used at the initial stage of designing a self-checking FSM to estimate the possible latency of the FSM to be designed.
