Abstract
Introduction
Correct operation of asynchronous circuits depends on timing assumptions that are much more complex than those in the synchronous case. In particular, an asynchronous circuit is by construction insensitive to most delay faults, because they often affect only its performance, not its functionality. Some delay faults, though, may affect correctness as well, and hence it is necessary to be able to test for them ([23]). Unfortunately, testing asynchronous circuits is a difficult problem, for the following main reasons:
• All known asynchronous design methodologies ensure correct operation (hazard-freedom) by using some level of redundancy, i.e., by sacrificing testability.
• Asynchronous control circuits tend to have more feedback and more registers than their synchronous counterparts. This means that full-scan testing may be unacceptably expensive.
This paper deals with the problem of generating test sequences for a given set of paths in an asynchronous circuit. We assume that the information about which delays in the manufactured circuit must be tested to ensure correct operation is available (e.g., from the synthesis tools). Previous work in the area of asynchronous circuit testing either used greedy heuristic techniques ([4]) to justify and propagate stuck-at faults, or used exhaustive synchronous mode testing for stuck-at faults ([3, 19]), or used manual transformations to ensure that a simple functional testing approach could test all stuck-at faults ([20]), or used a full-scan approach to robustly test all delay faults ([8, 12, 17]).

Alexander Taubin, The University of Aizu, Aizu-Wakamatsu, 965-80 Japan

This work has been partially supported by the Esprit project 21949 (Lavagno).
We consider two versions of the path delay fault testing problem: robust path delay fault testing (RPDFT) and hazard-free robust path delay fault testing (HFRPDFT).
The former test may allow better coverage and is simpler to generate. It guarantees that hazards in a circuit under test cannot produce false positives, but false negatives can occur. The latter guarantees that during the test, hazards cannot propagate along the paths under test, and does not admit false negatives. For sequential circuits it also guarantees that meta-stability cannot occur in the latches on the paths under test. (See [23] for an in-depth discussion of different versions of the path delay fault testing problems.) We solve the problem of path delay fault testing for asynchronous sequential circuits as follows.
Step 1: identification of a set of paths that cover all potentially dangerous faults¹ (Section 2.2). This is obtained by finding a set of linear inequalities that bound every relevant delay constraint (e.g., determining that the difference between two delays in a fanout stem is less than a given amount). All known synthesis procedures for asynchronous circuits provide this information either in the form of path delay bounds (e.g., [13]) or in the form of constraints on the relative delays of the branches of a fanout stem (e.g., [10]).
Step 2: reduction to asynchronous circuits with acyclic behavior (Section 2.1). The problem of testing an asynchronous circuit is reduced, by using a partial scan approach, to that of testing an object called an asynchronous net, in which feedbacks are allowed only inside asynchronous latches (e.g., Muller C elements, with inputs a and b and next state equation c' = ab + ac + bc). An asynchronous net is still a sequential object with internal memory, but it can exhibit only acyclic behavior.
Step 3: test sequence generation. For a combinational circuit, a delay fault test consists of pairs of vectors <v0, v1> applied to the primary inputs of a circuit. Vector v0 sets the outputs and the side inputs of the gates along the path under test to values which allow the propagation of the desired transition when v1 is applied. For sequential circuits, testing a delay fault requires in general the application of a sequence of vectors. The first part of this sequence performs correct initialization of latches along the path. The second part of the sequence is a testing pair <v0, v1>, where the setup vector, v0, sets the outputs and the side inputs of the gates along the path under the condition that all latches are already initialized as required. The test vector, v1, propagates the transition along the path.

¹ A delay fault is "dangerous" if it violates some assumption made during synthesis, e.g., a fundamental mode constraint, an isochronic fork and so on [13].
We decompose the problem of testing asynchronous nets into that of initializing memory elements, followed by path delay fault testing.
Step 3a: generating testing pairs <v0, v1> (Section 3). This problem is solved by reduction to stuck-at test pattern generation for a combinational circuit that can be directly derived from an asynchronous net. This approach was proposed in [21] for RPDFT of combinational circuits.
Additional conditions on the generated stuck-at test patterns for reduction of HFRPDFT are given. The method is further generalized to sequential nets, by modeling each latch with a combinational model (similar to the modeling of latches in time-frame unrolling [1]). We derive conditions on the value of the state inputs under which the test for the combinational circuit is valid also for the asynchronous net, without resorting to time-frame unrolling.
Step 3b: generating initialization sequences (Section 4). Vector v0 obtained at step 3a is the target of the initialization procedure. We present a heuristic algorithm for monotonous initialization that (if successful) generates initialization sequences of length bounded by n^2/2, where n is the latch count. Otherwise, we resort to classical time-frame unrolling [1] (which has an upper bound on the test sequence length of 4^n).

We improve with respect to previous work because our approach:
• Is complete, because it finds a test sequence for a given fault if one exists (while [4] heuristically maximized the number of tested paths, by using a greedy search algorithm). Note that previous work ([12]) has shown that asynchronous circuits generated with every known synthesis technique can be tested for delay faults by using a full-scan approach, so we can claim that every delay fault can be tested using the proposed method.
• Is automated (while [20] requires the designer to manually insert special circuitry, acting only in functional test mode, under guidance from a testability analysis tool).
• Requires only partial scan (while [8] required full scan and additional test inputs, and [17] is based on full scan and uses transformations of combinational logic that increase the level of testability).
• Requires only the output of a memory element to be scanned (while [12] required both inputs to each element to be independently scanned, which in general can be quite expensive).
The paper is organized as follows. Section 2 reviews the basic notions of delay fault testing and adapts them to asynchronous sequential circuits. Section 3 describes the reduction of HFRPDFT of sequential nets to that of combinational nets. Section 4 presents a procedure for initialization of asynchronous nets. Section 5 provides experimental results.
Asynchronous circuits and nets
An asynchronous circuit is an arbitrary interconnection of logic gates and input nodes, with each gate input connected to exactly one gate output or one input node, and with no two gate outputs tied together. Feedback can be either local inside gates (like SR latches or C-elements) or global outside gates.
Our strategy for testing asynchronous circuits is based on breaking all global feedback loops, by selecting a Minimum Feedback Vertex Set of the circuit graph, and converting all its gates into scan memory elements (like [5, 14] in the synchronous case). Such a transformation is obviously easier and cheaper if the selected gates are memory elements (see [12] for a scan SR latch circuit). Outputs of such gates then become simultaneously new primary inputs and primary outputs of the circuit.
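The loop-breaking step can be illustrated with a small sketch. The code below is not the selection procedure of the paper: it is a simple greedy heuristic that returns a feedback vertex set which is valid but not guaranteed to be minimum, and the graph `circuit` with gates a, b, c, d is a hypothetical example.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # gate name -> list of fanout gate names


def has_cycle(graph: Graph, removed: Set[str]) -> bool:
    """True if a directed cycle survives after deleting the `removed` vertices."""
    state = {v: 0 for v in graph if v not in removed}  # 0 new, 1 on stack, 2 done

    def visit(v: str) -> bool:
        state[v] = 1
        for w in graph[v]:
            if w in removed:
                continue
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True
        state[v] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in list(state))


def greedy_feedback_vertex_set(graph: Graph) -> Set[str]:
    """Heuristic (not guaranteed minimum) set of gates whose removal breaks all loops."""
    removed: Set[str] = set()
    while has_cycle(graph, removed):
        alive = [v for v in sorted(graph) if v not in removed]
        indeg = {v: 0 for v in alive}
        for v in alive:
            for w in graph[v]:
                if w not in removed:
                    indeg[w] += 1
        outdeg = {v: sum(w not in removed for w in graph[v]) for v in alive}
        # A vertex on a cycle has in-degree and out-degree of at least one,
        # so the best-scoring vertex always has a non-zero score here.
        removed.add(max(alive, key=lambda v: indeg[v] * outdeg[v]))
    return removed


# Hypothetical circuit graph: loop a -> b -> c -> a plus loop c -> d -> c.
circuit = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["c"]}
scan_gates = greedy_feedback_vertex_set(circuit)
print(scan_gates)  # {'c'}
```

In this example gate c lies on both loops, so scanning its output alone leaves a graph without global feedback.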
We call the resulting circuit, in which feedback can only be local, an asynchronous net. In this paper we will consider a particular class of asynchronous nets, which are composed from simple gates (AND, OR, NAND, NOR, and NOT) and C-elements. Using the macro-expansion operator [16], any complex gate can be converted to an equivalent connection of simple gates preserving testability properties. Handling asynchronous memory elements other than C-elements is a possible area of future work.
Identifying the
An asynchronous circuit operates correctly without hazards only if some delay constraints, which differ according to the design style used, are satisfied. All such constraints can be formulated in terms of comparisons among event propagation times along some circuit paths. For example, speed-independent circuits ([10]) operate correctly if and only if all the branches of a multiple fanout point have similar delays (generally, the maximum admissible spread is comparable with one gate delay).
Let us model the delay of each wire i in a circuit by using a variable di. In that case, the set of delay constraints that ensure the correct operation of the circuit can be modeled by a set A of linear inequalities over those variables. The problem, then, is to find a set of paths that allow us to prove that A is indeed satisfied, by bounding the delay along them. In other words, we would like to be able to find a set D of linear inequalities, each involving a measurable delay along an I/O path of the corresponding asynchronous net, such that the set of feasible solutions (assignments to the di's) of A ∪ D is the same as that of D.
The simplest solution to this problem is to greedily add testable path after testable path, until the inequalities in A all become redundant (i.e., the assumed delay bounds are implied by the measured delay bounds). A better solution, which is left to future work, would minimize the cardinality of the set of tested paths.
Note that in this case the trade-off between test sequence length and speed at which the circuit can operate ([11]) is possible only if the objective of the test is the determination of the actual performance of the circuit, because no compromise about its correctness is generally possible.
Path delay faults
In this paper we will use delay fault testing models originally developed for combinational circuits, and extend them to asynchronous nets. 
Definition 2.2 A controlling value for a gate g (denoted as C(g)) is a value of one of its inputs that determines the value at the output independent of the other inputs. Otherwise a value of the input is called non-controlling and denoted NC(g).
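Definition 2.2 can be checked by brute force on small gates. The following sketch is only an illustration: the gate table and the helper `controlling_values` are hypothetical names, and the C-element is modeled with its previous output as an extra argument, using the next state equation c' = ab + ac + bc given in the Introduction. Because all the gates used here are symmetric, it is enough to examine the first input.

```python
from itertools import product
from typing import Callable, List, Tuple

Gate = Callable[[Tuple[int, ...]], int]

# Simple gates as functions of a tuple of input values.
GATES = {
    "AND": lambda ins: int(all(ins)),
    "OR": lambda ins: int(any(ins)),
    "NAND": lambda ins: 1 - int(all(ins)),
    "NOR": lambda ins: 1 - int(any(ins)),
}


def c_element(ins: Tuple[int, ...]) -> int:
    """Two-input Muller C-element; the third argument is the previous output."""
    a, b, prev = ins
    return (a & b) | (a & prev) | (b & prev)


def controlling_values(gate: Gate, n_free: int) -> List[int]:
    """Values of the first input that fix the output for every choice of the
    remaining `n_free` arguments (other inputs and, if present, the state)."""
    found = []
    for value in (0, 1):
        outputs = {gate((value,) + rest) for rest in product((0, 1), repeat=n_free)}
        if len(outputs) == 1:
            found.append(value)
    return found


for name, gate in GATES.items():
    print(name, controlling_values(gate, n_free=1))
print("C", controlling_values(c_element, n_free=2))
```

For the two-input simple gates the search returns a single controlling value each (0 for AND and NAND, 1 for OR and NOR), while for the C-element model it returns an empty list.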
Note that a C-element has no controlling values, since the next value at the output always depends either on the value at both inputs, or on the previous value at the output. We then extend the definition to asynchronous nets as follows.
Definition 2.3 A controlling set for a gate g (denoted as CS(g)) is a set of values at some of its inputs that determines the value at the output independent of the other inputs or the previous state of the gate. Otherwise a set of values at some inputs is called non-controlling and denoted NCS(g).

For example, for a two-input C-element {1,1} and {0,0} are two controlling sets and {0,1} and {1,0} are two non-controlling sets.

A setup vector v0 for a combinational path has two functions: (1) it sets the outputs of all the gates along the path, and (2) it sets the side inputs of the same gates to values which allow the propagation of the desired transition. The first function is called initialization, the second is called setting. This may also work for a sequential path, as shown in Figure 1. Assume that path π1 = {x, 6, 8} is under test for the rising input transition. It is easy to see that vector v0 = <x = 0, Ri = 0, Ai = 1> will initialize the C-elements 4 and 8 into state 0 and also will set the side-input for gate 6 on the path π1 = {x, 6, 8} and the side-input for gate 7 on the path π2 = {x, 7, 8} at non-controlling values.
Then, by applying vector v1 = <x = 1, Ri = 0, Ai = 1> a rising transition will propagate from input x to output Ao along two paths π1 and π2. If there is a delay fault in at least one of the paths, it will be observed at the output.
Multiple paths can be tested with the same pair of vectors (see Figure 1). In that case, more than one constraint is obviously added to the set D that is used to bound the timing assumptions.
However, as will be shown in Section 
Path Delay Fault Testing
In this paper we consider two possible approaches to path delay fault testing ([21, 7, 23]).

The difference between these two models for combinational nets is illustrated by checking the testability of path π = {c, 1, 6} in Figure 2. As shown in Figure 2.a, test pair <v0, v1> = <110, 111> (the variable order is <a, b, c>) may cause a dynamic 1-to-0 hazard at the output of gate 6. However, if π has a longer delay than expected, then a falling transition at the output of gate 6 is delayed. In this case hazards cannot invalidate the test and <110, 111> is a robust test. If the output of gate 6 is observed at t1, when it has value 1, then we correctly conclude that there is a delay fault along π. If the output is observed at t2 or t4, when the output is at 0, we correctly conclude that there is no delay fault along π. However, if the output is observed at t3, when the output is at 1 due to hazards at the output of gate 5, then a false negative occurs: we incorrectly report a delay fault along π. Therefore, the robust test is conservative and can produce false negatives. As shown in Figure 2.b, test pair <v0, v1> = <100, 101>, propagating transition along two paths π = {c,
(including C-elements) on the path under test. This means that any C-element that may enter a meta-stable state due to a hazard is also forced to leave the meta-stable state by the time v1 finishes propagating. Nevertheless, if (due to a delay fault) the output of the C-element is observed at t1, when it is in a meta-stable state, then it may cause uncertainty in the test outcome. A meta-stable output either keeps a value between the logical "0" and logical "1" or oscillates. Hence the effect of meta-stable values on the primary outputs is similar to that of hazards: they can cause false negative results.
We will then discuss both hazard-free and non-hazard-free testing, because the latter may allow better coverage at the expense of more false negative results.

This definition also applies to C-elements. In particular, condition 1 requires that the input values of every C-element on π form a controlling set under the vector v1, and that the C-element output has the opposite value under v0. There is no constraint on the transitions on side-inputs for C-elements, other than that specified by condition 1.
Definition 2.5 does not restrict transitions at the side-inputs if g_{i-1}(v1) has a non-controlling value. Therefore hazards may occur. Testability for the falling input or rising and falling output transitions is defined similarly and differs only in condition 2 (e.g., for the falling output transition condition 2 is as follows: if OP(gi) = even, then gi(v1) = 0, otherwise gi(v1) = 1).

For the HFRPDFT one more condition must be added to prevent hazards along the path under test: (5) if g_{i-1}(v1) ∈ NC(gi) or gi is a C-element, then either there is no transition on fj or there is one monotonous transition on fj such that fj(v1) = g_{i-1}(v1).

In the next section we develop the theory for HFRPDFT, since it is the most complex case. A simplified version of the theory, that applies to RPDFT, can be easily derived as well.

Reduction of HF

The problem of HFRPDF test generation for asynchronous nets is solved by reducing it to classical stuck-at test pattern generation for a combinational circuit which can be directly derived from the asynchronous net. Since asynchronous nets contain latches they exhibit sequential behavior. Hence, in general, a stuck-at test for an asynchronous net is not a single vector but may require applying a sequence of vectors. For this reason the reduction is done in two steps:
• Relating sequential HFRPDFT to combinational HFRPDFT (Section 3.1)
• Relating HFRPDFT for a combinational circuit to stuck-at test generation for a slightly modified combinational circuit. This approach was proposed in [21] for RPDFT of combinational circuits. Additional conditions on the generated stuck-at test patterns for HFRPDFT can be found in [9].
Note that, in practice, the first step is performed relative to a particular test vector pair (Section 4 describes one such approach), so the order in the algorithmic implementation is reversed. Moreover, there is potentially a need to backtrack and select another test pair if it does not satisfy the initializability conditions.
Given an asynchronous net, C, let us substitute each

The conversion of C-elements to combinational M-gates is purely logical and serves only the test generation algorithm for the original sequential circuit; no actual physical transformation of the original sequential circuit is required.
Definition 3.1 The combinational circuit obtained from an asynchronous net C by replacing each C-element with an M-gate is called an M-net and is denoted M(C).
If an asynchronous net has primary inputs I = {i1, ..., ik} and C-elements L = {c1, ..., cl}, then the corresponding M-net has k + l primary inputs {i1, ..., ik, m1, ..., ml}. The following conditions determine the value of the M-input that implies a stable behavior of the M-gate if its output is connected to its input (thus a C-element again). Figure 3 shows an example M-net, corresponding to the asynchronous net from Figure 1.
Definition 3.2 A vector v of inputs to the M-net is called consistent with initialization if for each M-gate, Mj, the following condition is satisfied: cj(v) = mj(v).
In other words, the final value cj at the output of each M-gate after applying v is the same as that of the M-input mj.
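The consistency condition cj(v) = mj(v) is easy to state in code. The sketch below makes an assumption that should be read as such: the M-gate is taken to be the C-element next state function c' = ab + ac + bc with the feedback cut, so that the M-input m replaces the previous output. The function names and the triple-based encoding of a vector are hypothetical.

```python
from typing import Iterable, Tuple


def m_gate(a: int, b: int, m: int) -> int:
    """Assumed combinational M-gate: C-element equation with feedback replaced by m."""
    return (a & b) | (a & m) | (b & m)


def consistent_with_initialization(gates: Iterable[Tuple[int, int, int]]) -> bool:
    """`gates` lists, for every M-gate, the values (a, b, m) it sees under vector v.
    The vector is consistent if each M-gate output equals its own M-input."""
    return all(m_gate(a, b, m) == m for a, b, m in gates)


# Both gates are stable: the first holds 1, the second holds 0.
print(consistent_with_initialization([(0, 1, 1), (1, 0, 0)]))  # True
# The second gate would switch to 1 although its M-input says 0.
print(consistent_with_initialization([(0, 1, 1), (1, 1, 0)]))  # False
```

A vector that fails this check describes a state the real C-element would leave on its own, which is why such vectors cannot be used directly as test vectors for the sequential net.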
Let v = <v0, ..., vk> be a binary or ternary vector (vi ∈ {0, 1, -}) for variables from set X and let Z ⊆ X. Then v|Z denotes the sub-vector of v corresponding only to variables from Z. The following theorem states the conditions under which a combinational logic test derived for an M-net is valid also for the corresponding sequential asynchronous net.

The proposed monotonous initialization procedure begins from those C-elements that are closer to the primary outputs (in backward topological order). The polynomial bound on the (possibly non-existing) monotonous initialization sequence length is due to the fact that a C-element is not disturbed after having been set.
We can associate four Boolean functions (defined over the space of primary inputs and C-element outputs) with each element g (gate or C-element) of an asynchronous net.
• S1(g) and S0(g): setting g to 1 or 0, respectively, and
• H1(g) and H0(g): holding g at 1 or 0, respectively.
If g is a basic combinational gate with output function f then S1(g) = H1(g) = f while S0(g) = H0(g) = ¬f.
In case of a C-element, the holding and setting functions are different. For C-element cj with inputs i1, ..., ik they are: S1(cj) = H1(i1) * ... * H1(ik) and H1(cj) = cj * (H1(i1) + ... + H1(ik)) (similarly for S0(cj) and H0(cj)).
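The setting and holding functions of a C-element can be written down directly from these formulas. The sketch below is illustrative only. It assumes that for a primary input i we have H1(i) = i and H0(i) = ¬i, and that "similarly" means S0(cj) = H0(i1) * ... * H0(ik) and H0(cj) = ¬cj * (H0(i1) + ... + H0(ik)). All helper names are hypothetical.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, int]          # values of primary inputs and C-element outputs
Fn = Callable[[State], int]     # a Boolean function over that space


def var(name: str) -> Fn:
    return lambda s: s[name]


def neg(f: Fn) -> Fn:
    return lambda s: 1 - f(s)


def conj(fs: Sequence[Fn]) -> Fn:
    return lambda s: int(all(f(s) for f in fs))


def disj(fs: Sequence[Fn]) -> Fn:
    return lambda s: int(any(f(s) for f in fs))


def c_element_functions(name: str, h1_inputs: Sequence[Fn],
                        h0_inputs: Sequence[Fn]) -> Dict[str, Fn]:
    """Setting and holding functions of C-element `name`, given the holding
    functions of its inputs (S0 and H0 follow the assumed dual formulas)."""
    out = var(name)
    return {
        "S1": conj(h1_inputs),
        "H1": conj([out, disj(h1_inputs)]),
        "S0": conj(h0_inputs),
        "H0": conj([neg(out), disj(h0_inputs)]),
    }


# Two-input C-element c fed directly by primary inputs a and b.
a, b = var("a"), var("b")
fns = c_element_functions("c", [a, b], [neg(a), neg(b)])
print(fns["S1"]({"a": 1, "b": 1, "c": 0}))  # 1: both inputs at 1 set the element
print(fns["H1"]({"a": 1, "b": 0, "c": 1}))  # 1: one input at 1 is enough to hold 1
print(fns["H1"]({"a": 0, "b": 0, "c": 1}))  # 0: both inputs at 0 do not hold 1
```

For a C-element at a higher level, the lists passed as `h1_inputs` and `h0_inputs` would themselves be holding functions of gates or of lower-level C-elements, which is how the dependence on previously set elements arises.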
To set a C-element cj we must apply an input vector under which the corresponding setting function evaluates to 1. However only for C-elements of the first level the value of a setting function is completely determined by primary inputs. For cj in level i the setting function depends also on the outputs of C-elements from the lower levels. Therefore the process of setting cj may require the recursive setting of "preceding" C-elements.
Let us consider in more detail the process of setting cj to 1 by using an input vector v (resetting it to 0 can be done similarly). Let Ch denote the C-elements from levels higher than i. If v sets cj then two conditions must be satisfied:
1. there exists a cube β ∈ S1(cj) such that β|I covers v.
2. C-elements Cβ whose outputs have a value of 0 or 1 in β (these are the C-elements on which cj depends) have already been set by previous input vectors w1, ..., wn.

Application of vector v after wn can lead to the following difficulties:
• cm ∈ Cβ can change its value inside the transition cube between wn and v. In such a case the value of β in v does not evaluate to 1 and cj is not set.
• cp ∈ Ch can change its value inside the transition cube between wn and v. In such a case the requirement of monotonicity of the initialization procedure is violated.

These two conditions restrict the set of valid vectors v that can be applied after wn, leading towards v0, which is the objective of initialization. If we denote by Hold the product of the holding functions of C-elements in Ch ∪ Cβ, then a valid transition path between wn and v must belong to Hold. The task of finding a valid path can be reduced to a search in a graph with: (1) vertices corresponding to cubes of Hold and (2) edges between every pair of intersecting cubes from Hold.
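The graph search over cubes of Hold can be sketched as a breadth-first search. This is an illustration under stated assumptions rather than the implementation used in the paper: cubes and vectors are encoded as strings over '0', '1' and '-', the list `hold` is a made-up cover, and the function returns the chain of cubes rather than the actual intermediate input vectors.

```python
from collections import deque
from typing import List, Optional


def intersects(p: str, q: str) -> bool:
    """Two cubes intersect if no position has 0 in one and 1 in the other."""
    return all(a == b or a == "-" or b == "-" for a, b in zip(p, q))


def covers(cube: str, vector: str) -> bool:
    """A cube covers a fully specified vector if it agrees on every care bit."""
    return all(c == "-" or c == x for c, x in zip(cube, vector))


def find_valid_path(hold_cubes: List[str], start: str, target: str) -> Optional[List[str]]:
    """Chain of pairwise intersecting cubes leading from a cube that covers
    `start` to a cube that covers `target`, or None if no such chain exists."""
    roots = [i for i, c in enumerate(hold_cubes) if covers(c, start)]
    parent = {i: None for i in roots}
    queue = deque(roots)
    while queue:
        i = queue.popleft()
        if covers(hold_cubes[i], target):
            path = []
            while i is not None:
                path.append(hold_cubes[i])
                i = parent[i]
            return path[::-1]
        for j, cube in enumerate(hold_cubes):
            if j not in parent and intersects(hold_cubes[i], cube):
                parent[j] = i
                queue.append(j)
    return None


hold = ["11-", "-10", "0-0"]                 # hypothetical cover of Hold
print(find_valid_path(hold, "111", "000"))   # ['11-', '-10', '0-0']
print(find_valid_path(["11-", "0-0"], "111", "000"))  # None
```

Each consecutive pair of cubes in the returned chain shares at least one point, so the input can be moved from one cube to the next without ever leaving Hold.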
If no valid path exists, then cj cannot be set by the cube β and another cube from the setting function of cj is tried. If we fail to find such a path for all cubes of cj, then we need to backtrack. The procedure converges, because full scan testing is always possible.
S1(Ao) = x * H0(Ro) * (x + H1(Ro))
H1(Ao) = Ao * (x * H0(Ro) + x + H1(Ro))
tions for Ao we get: S1(Ao) = x%Ai + x X R i and
From S1(Ao) it follows that to set Ao at level 2 we first need to reset Ro at level 1. The latter can be done by the vector w1 = <Ai = 1, x = 1, Ri = 1>. Note that after Ro is reset, the same vector w1 sets Ao to 1.
The next initialization step is to set Ro. To do this we can try cube ¬Ai ¬x Ri of function S1(Ro). However, if we apply vector w2 = <Ai = 0, x = 0, Ri = 1> after w1 we cannot keep the value 1 on the output of Ao because H1(Ao) is equal to 0 under w2 (remember that after w1, Ro is reset to 0). Therefore no valid path from w1 to w2 exists and we need to try the next cube in S1

Generating a sequence of initialization vectors to set the output value of each C-element along the path being tested to the desired value.
As already mentioned, if initialization does not succeed for a chosen test vector pair, additional vector pairs are selected until initialization succeeds for at least one or fails for all.

Circuits from the first class are characterized by a very high density of signal interconnections. We checked four techniques for achieving high testability: full input scan, full output scan, partial output scan and partial output scan with splitting of primary inputs. Splitting of input connections (that was defined in [12] for true and complemented phases only) is a powerful technique to increase the testability of asynchronous circuits, because it reduces the redundancy level. As illustrated by Figure 5, input scan requires scan-in and scan-out operations for both inputs of a latch, while output scan requires scan-in and scan-out only for the output of the latch. Fork and phase splitting requires scan-in in addition to scan-out. Fork splitting means the possibility to scan independently the fanout branches of a fork. Phase splitting means the possibility to drive independently the true and complemented phase of each primary input and sequential element.

Table 1 presents the results for speed-independent control logic. The first two columns describe circuit complexity: the number of paths and the number of non-input (i.e., feedback) signals of the circuit. There is no column on runtime because for our small circuits test generation always took less than 10 ms. The "approach of [12]" columns show the percent level of testability for input scan and for output scan of all non-input signals respectively, using the technique of [12]. The "partial scan" columns show the level of testability for a selected number of scanned signals for the output scan technique. The last group of columns shows how the level of testability can be increased if the splitting technique is used in addition to partial scan. The column labeled "split" gives the number of split signals.
For example, the best level of testability that can be achieved by partial output scan for circuit "converta" (two out of three signals are scanned) is 64%. If one additional signal is split, then testability reaches 71%.
The sets of signals for partial scan and for splitting are selected to achieve the required testability level. In Table 1 we attempt to reach a testability level of 70%, and limit the number of signals which are allowed to be scanned and split as explained in the algorithm of Figure 6. For example, we do not allow more than one signal to be split for circuit "converta" or more than five signals to be scanned for circuit "master-readcsc.map2". The requested level of testability cannot be achieved by scanning only five signals in "master-readcsc.map2", but can be achieved by splitting two additional signals.
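The selection loop with a testability target and scan and split limits can be sketched as follows. This is one possible reading of the procedure, not the authors' implementation: the `testability` callback stands in for the real fault coverage computation, the choice of a bi-partition of the fanout set is omitted, and the graph, the scoring function and all names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, List[str]]  # signal -> list of fanout signals


def select_scan_and_split(
    graph: Graph,
    testability: Callable[[List[str], List[str]], int],
    target: int,
    scan_limit: int,
    split_limit: int,
) -> Tuple[List[str], List[str]]:
    """Greedily grow the scan set by vertex degree and, while the target is not
    met, the split set by out-degree, stopping at the target or at the limits."""
    indeg = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indeg[w] += 1
    degree = {v: indeg[v] + len(graph[v]) for v in graph}

    scan: List[str] = []
    split: List[str] = []
    while len(scan) < scan_limit and len(scan) < len(graph):
        # Updating scan set: add the unscanned vertex with maximal degree.
        scan.append(max((v for v in graph if v not in scan), key=degree.get))
        # Updating split set: split scanned signals with maximal out-degree.
        while testability(scan, split) < target and len(split) < split_limit:
            candidates = [g for g in scan if g not in split]
            if not candidates:
                break
            split.append(max(candidates, key=lambda g: len(graph[g])))
        if testability(scan, split) >= target:
            break
    return scan, split


# Toy setting: every scanned signal adds 20 points, every split signal 10.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
toy_score = lambda scan, split: 20 * len(scan) + 10 * len(split)
print(select_scan_and_split(toy_graph, toy_score, 70, 5, 1))
# (['a', 'c', 'b'], ['a'])
```

With the toy score the target of 70 is reached after three scanned signals and one split signal, which mirrors the way the limits trade scan cost against splitting cost.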
Our algorithm for selecting a set of signals for partial scan and for signal splitting operates on the directed graph

repeat
    /* Updating scan set */
    Add a vertex with a maximal degree to Scan;
    If testability is less than requested then
        repeat
            /* Updating split set */
            Split a signal g ∈ Scan with a max. out-degree;
            Choose bi-partition of g's fan-out set to minimize the number of reconvergent paths;
        until The requested level of testability is achieved or the splitting limit is exceeded;
until The requested level of testability is achieved or the scan limit is exceeded;
end
Figure 6: Algorithm for

Circuits from the second class are known to have high stuck-at testability. We expected that a high level of path delay fault testability could be achieved with a low scan ratio. Table 2 presents

the two operands, the sum-in and the carry-in, and output latches for the sum-out and the carry-out. To obtain full testability, four feedback wires between the result latches and the operand latches must be scanned. The pipelined adder example is a DIMS-adder incorporated into the pipelined ring. A few interesting observations can be made:
• Although dependency graphs are very dense for speed-independent random control logic, partial scan augmented with splitting techniques can provide a high level of testability.
• Full output scan (which can be achieved by inserting transparent latches after each C-element) provides a relatively good testability level around 80%-90%. This gives, to the best of our knowledge, the first experimental evidence that speed-independent circuits are easily testable even when more accurate fault models than pure output stuck-at are used ([2] first proved their self-checking properties with respect to this kind of fault).
• Experiments with delay-insensitive adders show that the optimization technique used for area efficiency reduces the level of testability from 100% in the DIMS adder to 74% in Martin's adder (which is competitive in area and faster than a synchronous ripple-carry adder). We may conclude that the optimization transformations which correspond to quasi-delay-insensitive substitution do not retain testability.
Conclusions
In this paper we have described a complete path-delay fault testing algorithm for asynchronous sequential circuits. We have shown that it is possible to perform such tests by partial scan on a sequential object called an asynchronous net. We defined the set of paths that must be tested to check all the timing assumptions. We decomposed the testing problem for sequential circuits into:
1. insertion of enough scan elements to make the asyn-
2. initialization (using a heuristic technique, with a fall-
3. test pattern generation, by reduction to combinational
Experimental results show that the technique is effective in providing substantial savings in the number of scan memory elements, at the cost of a small reduction in the testability figures. This is true of control-dominated dense circuits, and even more of regular data path objects.
