In this paper, no attempt is made to benchmark systems in the sense of ranking them. Rather, the intent is to present an approach for quantifying the attributes of CATS systems for the sake of selecting benchmark circuits. First, the role of benchmarking in other fields is briefly examined in order to gain more insight into the selection of benchmark circuits. Then the different functions of a CATS system are reviewed in order to abstract the attributes that need to be assessed.
With increased device density on a chip, testing cost has become a major proportion of total system development cost. This proportion "has increased from 10% in the mid seventies to 25-50% today" [7]. The increase in testing cost is expected to continue because many problems are still encountered with present testing techniques [45].
In the development of test sets as well as in the verification of design-for-testability techniques, computer-aided design and test equipment have become indispensable. In this paper, the software tools generally used in testing, namely automatic test pattern generators (ATPGs), logic and fault simulators, and testability analysis programs, will be referred to as Computer-Aided Testing Systems (CATS).
As the demand for CATS systems increases and the choices become more varied, it is important to compare the merits of different systems. Twenty-nine systems were reported in a survey conducted by HH&B on behalf of the Navy [39]. Not all of these were equipped with ATPGs. Only a few of these systems survived the fierce competition among developers and were listed in a later survey [47], which reported over 45 systems.
We present some experimental results that will help in the characterization of CATS components. The focus is particularly on Automatic Test Pattern Generators, although many of the results can also be applied to fault simulators. First, the different functions of a CATS system are reviewed in order to abstract the attributes that need to be assessed.
The experiments are carried out using three CATS systems and several benchmark circuits. They help assess: 1) different approaches to fault coverage and collapsing, and 2) the behavior of ATPGs (test set length and generation time) as circuit complexity and topology vary.
Criteria [22]. Subsequent efforts in benchmarking computers [4], [17], [20], [41], [51] resulted in the following conclusions: 1) it is not pos- [27], their graphics capabilities [9] and the correctness of floating point operations [24]. Benchmarks can even be tailored to assess a particular operating system such as UNIX [3].
In the digital design and testing community, some circuits have been used as test cases by researchers to demonstrate their methods. The infamous 181 ALU has been used by many researchers [31], [32], [23] as well as by commercial systems [2]. In a study on minimization techniques for PLAs, 56 benchmark circuits were used [8]. In logic synthesis, several suites of benchmarks have been announced [10], [11].
At the 1985 International Symposium on Circuits and Systems (ISCAS), a set of 10 combinational circuits was proposed [10] so that several researchers could demonstrate test pattern generation on them. As soon as these circuits were made public they became very popular and were used in many studies [1], [16], [38], [42], [43], [44], [50], [49]. Sequential benchmark circuits, ISCAS 89 [11], were proposed subsequently.
There are other indications that benchmark circuits are needed in digital testing. In 1979, a report on benchmarking CATS was published by the Navy [39]. Almost ten years later, the issue was again discussed by Greer [21], and a whole session was devoted to the same topic at COMPCON Spring 88 [14], [18], [29] and at the 1988 International Test Conference [35].
In the remaining sections of this paper, the presentation focuses on CATS in which: 1) the circuit is represented at the gate level, and 2) only single stuck-at faults are considered. Fault modeling, collapsing and coverage are examined first. Then the performance of the ATPGs is analyzed using three CATS, referred to as systems A, B, and C. They run on an Apollo DN550, a Microvax and a Sun 3/160, respectively. It is important to reiterate that the intent is not to rank these systems but to use them as vehicles for formulating guidelines and recommendations for the selection of appropriate benchmark circuits.
COMPUTER-AIDED TESTING SYSTEMS

FAULT MODELING AND COVERAGE
Fault Modeling
The stuck-at fault is the most widely used fault model. Although it is recommended [19] to consider up to 6 simultaneous faults, none of the commercial CATS attempts to detect multiple faults. Fault collapsing and fault dropping are other issues that affect the performance of the simulator. With fault dropping, once a fault is detected it is dropped from the fault list and no further attempts are made to detect it with other patterns. This is a disadvantage if, for example, the detectability profile for a circuit is to be constructed [28].
Yet affected. The simulators allow the use of either option or both. When both are used, stuck-at faults on the branches can be included in the fault list. This is the case for the output of gate 8 in Figure 3(a). However, for fan-out free outputs, the SA faults are duplicated, as illustrated in Figure 3(a), where the SA fault at the output of gate 9 and the open-to fault at the input of gate 11 are the same fault. To include faults on the branches without duplicating faults on the fan-out free outputs, special workaround techniques have to be used in the case of system B, but this can be easily implemented for system C.
[10]. The results of this experiment are given in Table III, where the number of nets (the number associated with the circuit's name), the gate count, the fault classes, the test length, the test generation time and the fault coverage are listed. The change in test length and generation time with the number of wires is shown in Figure 4. These results indicate that there is no correlation between the number of patterns
Figure 5(b), the fault coverage of system A is better than that of system B for all circuits except C499.
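The fault dropping technique described under Fault Modeling can be illustrated with a minimal sketch. The three-input circuit, the line names, and the functions `simulate` and `fault_simulate` are hypothetical and chosen only for illustration; they do not reproduce the internals of systems A, B, or C. The sketch counts how many faulty-circuit simulations are needed with and without dropping.

```python
from itertools import product

# Hypothetical 3-input circuit (an assumption for illustration): out = (a AND b) OR c
INPUTS = ("a", "b", "c")
GATES = [("n1", "AND", "a", "b"), ("out", "OR", "n1", "c")]
OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y}


def simulate(pattern, fault=None):
    """Return the circuit output; fault is (line, stuck_value) or None."""
    values = {}

    def assign(line, value):
        # A faulty line keeps its stuck value regardless of the driven value.
        values[line] = fault[1] if fault and fault[0] == line else value

    for name, bit in zip(INPUTS, pattern):
        assign(name, bit)
    for out, op, x, y in GATES:
        assign(out, OPS[op](values[x], values[y]))
    return values["out"]


def fault_simulate(patterns, drop=True):
    """Return (set of detected faults, number of faulty-circuit simulations)."""
    lines = list(INPUTS) + [gate[0] for gate in GATES]
    remaining = [(line, sv) for line in lines for sv in (0, 1)]
    detected, runs = set(), 0
    for pattern in patterns:
        good = simulate(pattern)
        kept = []
        for fault in remaining:
            runs += 1
            hit = simulate(pattern, fault) != good
            if hit:
                detected.add(fault)
            # With dropping, a detected fault is not simulated again.
            if not (hit and drop):
                kept.append(fault)
        remaining = kept
    return detected, runs


patterns = list(product((0, 1), repeat=len(INPUTS)))
with_drop, runs_drop = fault_simulate(patterns, drop=True)
no_drop, runs_full = fault_simulate(patterns, drop=False)
print(len(with_drop), runs_drop, runs_full)
```

Both runs detect the same faults, but the run with dropping performs fewer simulations. The run without dropping is the one that would be needed to build a detectability profile, since it records every pattern against every fault.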
Since for a given system, the same algorithm is used on all circuits, the profiles shown in Figures 4 and 5
Figure 6, to form larger comparators [37].
Figure 8. The values are normalized to the generation time for the smallest circuit (a 4-pair comparator and a 4-input parity tree). In order to emphasize the difference between the two curves for the small circuits, a logarithmic scale is used. The curves for the comparators indicate that generation time grows exponentially for both ATPGs, but the rate of increase is higher for system C than for system A. For the parity trees, however, the curve for system A rises at a higher rate than that for system C. The different behavior of the two systems can be explained as follows: system C generates only 4 patterns for any size parity tree (a correct optimum test set [6]), while for system A the number of test patterns increases with the number of input pairs.
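The observation that 4 patterns suffice for a parity tree of any size can be checked with a small sketch. It assumes a balanced tree of 2-input XOR gates whose input count is a power of two, and single stuck-at faults on every line; the construction (three 4-bit sequences closed under XOR) and all function names are illustrative assumptions, not the method used by system C.

```python
# Three 4-bit sequences closed under XOR: A ^ B == C, A ^ C == B, B ^ C == A.
SEQ = {"A": (0, 1, 0, 1), "B": (0, 0, 1, 1), "C": (0, 1, 1, 0)}


def build_parity_tree(n_inputs):
    """Balanced tree of 2-input XOR gates (n_inputs must be a power of two).
    Lines 0..n_inputs-1 are primary inputs; gates are (out, in_a, in_b)."""
    level, gates, next_line = list(range(n_inputs)), [], n_inputs
    while len(level) > 1:
        upper = []
        for i in range(0, len(level), 2):
            gates.append((next_line, level[i], level[i + 1]))
            upper.append(next_line)
            next_line += 1
        level = upper
    return gates


def four_patterns(n_inputs, gates):
    """Label each line A, B or C so that every gate sees two different
    sequences on its inputs, then read off the 4 input patterns."""
    label = {gates[-1][0]: "C"}
    for out, a, b in reversed(gates):
        label[a], label[b] = [s for s in "ABC" if s != label[out]]
    return [tuple(SEQ[label[i]][t] for i in range(n_inputs)) for t in range(4)]


def simulate(pattern, gates, fault=None):
    """Return the tree output; fault is (line, stuck_value) or None."""
    def stuck(line, value):
        return fault[1] if fault and fault[0] == line else value

    values = [stuck(i, bit) for i, bit in enumerate(pattern)]
    for out, a, b in gates:
        values.append(stuck(out, values[a] ^ values[b]))
    return values[-1]


def coverage(n_inputs):
    """Return (detected, total) single stuck-at faults for the 4 patterns."""
    gates = build_parity_tree(n_inputs)
    patterns = four_patterns(n_inputs, gates)
    n_lines = n_inputs + len(gates)
    faults = [(line, sv) for line in range(n_lines) for sv in (0, 1)]
    detected = sum(
        any(simulate(p, gates) != simulate(p, gates, f) for p in patterns)
        for f in faults
    )
    return detected, len(faults)


for n in (4, 16, 64):
    print(n, coverage(n))
```

Because every line carries one of the three sequences, each line takes both logic values across the 4 patterns, and a value change on any line of a fan-out free XOR tree always reaches the output.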
The behavior of systems C and A for the two types of circuits is shown in Figure 9. In this case the time is plotted versus the number of fault classes. It is clear that for
[40]. This last suite consists of comparators, parity trees, ALUs, funnel shifters, counters, and other circuits.
[46] and EDIF [13], among others [12], [48]. It is thus natural to explore the possibility of exchanging the circuit descriptions through these intermediate languages. This is illustrated in Figure 11. Here each system is both a source and a target. Each system then needs a translator from its own language to the neutral language (a writer) and one in the opposite direction (a reader) [13]. Thus
[30]. Also, a netlist in EDIF can easily be put into a schematic form and vice versa.
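The reader/writer arrangement around a neutral language can be put in numbers with a short sketch. The comparison with direct pairwise translation is an assumption added here for illustration, since the surviving text only states that each system needs one writer and one reader; the function names are hypothetical.

```python
def neutral_translators(n_systems):
    """One writer (to the neutral language) and one reader (from it) per system."""
    return 2 * n_systems


def direct_translators(n_systems):
    """Assumed alternative: one translator per ordered pair of distinct systems."""
    return n_systems * (n_systems - 1)


for n in (3, 5, 10):
    print(n, neutral_translators(n), direct_translators(n))
```

Under this assumption the two schemes need the same number of translators for three systems, and the neutral-language scheme needs fewer for any larger number of systems.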
The translation of the NSF 89 benchmarks into EDIF is reported at EDIF World 89 [26] .
SUMMARY AND CONCLUSIONS
Computer-Aided Testing Systems, as computer software packages, need to be assessed and compared. Just as benchmark programs are written to evaluate computers, digital circuits need to be selected to evaluate testing systems. Like computer systems, CATS require several types of benchmarks, since no single benchmark can suffice to measure all attributes of a system.
The focus in this study was on Automatic Test Pattern Generators, although many of the results can also be applied to fault simulators. Experiments were carried out using three CATS and several benchmark circuits. Only combinational circuits were considered. Further investigation will make use of sequential circuits, for which test pattern generation is more complex [34]. The
