The ASIC design flow is rapidly moving towards higher description levels, and most design activities are now performed at the RT-level. However, test-related 
Introduction
In recent years the application-specific integrated circuit (ASIC) design flow has experienced radical changes. Deep sub-micron integrated circuit (IC) manufacturing technology is enabling designers to put millions of transistors on a single integrated circuit. Following Moore's law, design complexity is roughly doubling every 12-18 months. In addition, there is an ever-increasing demand for reducing time to market. With complexity skyrocketing and such competitive pressure, designing at high levels of abstraction has become more of a necessity than an option.
In this scenario, high-level test pattern generation is increasing its industrial relevance [15]. Designers would like to foresee the testability of an ASIC before starting its logic synthesis. Design practice is pushing the insertion of design-for-testability (DfT) structures up to the register-transfer (RT) level, and their effectiveness should be evaluated as soon as possible. In addition, it has been increasingly observed that gate-level sequential automatic test pattern generation (ATPG) techniques may take unacceptable amounts of computing time and resources to tackle larger sequential circuits unless design-for-testability structures are used. High-level ATPG tools are expected to exploit compact information about design structure and behavior, and to generate high-quality test sequences more efficiently. Moreover, high-level generated test benches are expected to detect faults that would be very hard for gate-level ATPGs [18].
This paper presents Prince, an algorithm for implementing a high-level ATPG. The proposed technique mixes a code coverage-oriented approach with fault-oriented optimizations.
Section 2 sketches some background information, Section 3 details the algorithm, Section 4 reports some experimental results and Section 5 concludes the paper.
Background
Tackling test issues above the gate level is a hard task, and the lack of a fault model is one of the hardest theoretical barriers.
Code-coverage based fault models, deriving from the software testing field, may seem suitable for HDL descriptions. However, coverage metrics such as line/block coverage, branch/conditional coverage, expression coverage and path coverage lack a direct relationship with gate-level stuck-at faults, and their applicability in the field of test is difficult. Other considerable difficulties stem from the large amount of concurrency, from the complexity of timing schemes and from the combined presence of behavioral and structural description styles. The main problem with code-coverage based fault models, however, is probably the lack of an explicit observability concept. Coverage metrics only consider reachability, which corresponds to fault controllability in the gate-level domain. However, any ATPG should tackle faulty-behavior observation as well [5].
[16] presents TAO, a two-pass approach using a symbolic RTL test generator. The proposed testing paradigm involves writing path equations for modules, given the RTL connectivity, and solving them to obtain regular expressions for control paths.
Probably the most successful proposal of a hardware-related high-level fault model is Observability-Enhanced Statement Coverage [7]. It introduces the concept of a tag as the possibility that an incorrect value is computed at a given location, thus approximating the effects of fault propagation. Since this fault model does not assume any specific fault effect, its generality prevents explicit fault simulation.
The first ATPG exploiting Observability-Enhanced Statement Coverage was presented in [6]. The vector generation procedure is based on hybrid linear programming and Boolean satisfiability methods.
ARTIST, a different RT-level ATPG exploiting high-level information to reach high code-coverage figures, was presented in [3]. Unlike [6], ARTIST is a simulation-based approach. It is based on an evolutionary algorithm coupled with a commercial VHDL simulator. Thanks to the adoption of a commercial tool, it is able to produce sequences for general synthesizable VHDL descriptions, with few limitations in complexity and characteristics, and it does not require any effort for re-modeling circuits or extracting special information. However, since it neglects observability, the sequences generated by ARTIST are not optimized for test purposes.
In [4], the ARTIST code-coverage metric was augmented with simplified observability. Fault-coverage figures increased dramatically, but the lack of a real fault model prevented the use of a fault-dropping mechanism. ARTIST was given the goal of increasing an observability measure, without a meaningful stopping condition. Thus, the approach was not suitable for larger designs.
In [1] an extension of observability-enhanced statement coverage was proposed. In the new model, explicit RT-level assignment single-bit stuck-at faults are used instead of generic tags. An RT-level assignment single-bit stuck-at fault is defined as a single-bit stuck-at in the effect of an RT-level assignment operation: when a fault is present, the affected object (signal or variable target of an assignment statement) loads the correct value, except for one bit that is forced to 0 or 1. Experimental figures show that this model is highly correlated with gate-level coverage.
[2] shows a simulation technique based on simulation command scripts that allows efficient exploitation of RT-level assignment single-bit faults. Using the Tcl interface of a commercial simulator, the simulation of each faulty circuit is shown to be no more costly than the simulation of the original circuit.
This paper presents new techniques for devising a high-level ATPG process. The proposed algorithm, described in the next sections, mixes a coverage-oriented approach with fault-oriented optimizations. First, the RT-level circuit description is automatically analyzed to extract static structural information, control and data dependencies, and to group statements into basic blocks. Then a code coverage approach is exploited to excite the RT-level assignment single-bit faults. After excitation, fault effect propagation and observation are tackled using simulation scripts. Finally, a fault dropping phase is run to optimize the process.
ATPG System
Prince, the ATPG algorithm (Figure 1), is built around a genetic algorithm (GA). The GA in Prince evolves a population of p individuals with an offspring ratio of p_o in each generation. It implements a steady-state evolution, i.e., new individuals are first added to the population, and then the p fittest ones are chosen for survival. Individuals are selected for reproduction using their linearized fitness. With a given probability (the mutation rate), the new individual is built by mutating a single parent: the original sequence can be shortened or enlarged, or some bits may be flipped. Otherwise the new individual is built by mating two sequences: it can inherit the beginning from one parent and the end from the other, or some entire bit columns from each parent. The GA evolves until the goal is reached, or until the maximum number m_g of generations has been evaluated, or after a fixed number of generations without any fitness improvement in the best individual.
At the end of the process, sequences may easily be compacted with a simple algorithm. The next sections detail the process.
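The steady-state scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Prince implementation: the function names, the parameter names p_o, p_m and max_stall, the rank-based reading of "linearized fitness", and the representation of a sequence as a list of bit vectors are all assumptions made here. The default values are the ones quoted in the experimental section.

```python
import random

def mutate(seq, width, rng):
    """Return a copy of seq that is shortened, enlarged, or has one bit flipped."""
    seq = [vec[:] for vec in seq]
    op = rng.choice(["shorten", "enlarge", "flip"])
    if op == "shorten" and len(seq) > 1:
        del seq[rng.randrange(len(seq))]
    elif op == "enlarge":
        seq.insert(rng.randrange(len(seq) + 1),
                   [rng.randint(0, 1) for _ in range(width)])
    else:  # "flip", also used when a one-vector sequence cannot be shortened
        seq[rng.randrange(len(seq))][rng.randrange(width)] ^= 1
    return seq

def mate(a, b, width, rng):
    """Combine two parents by a cut point or by whole bit columns."""
    if rng.random() < 0.5:
        cut = rng.randint(1, len(a))          # beginning from a, end from b
        return [vec[:] for vec in a[:cut] + b[cut:]]
    n = min(len(a), len(b))                   # child truncated to the shorter parent
    from_a = [rng.random() < 0.5 for _ in range(width)]
    return [[a[t][k] if from_a[k] else b[t][k] for k in range(width)]
            for t in range(n)]

def evolve(fitness, goal, width, p=50, p_o=0.6, p_m=0.3,
           max_gen=50, max_stall=10, seed=0):
    """Steady-state GA: add offspring, then keep the p fittest individuals."""
    rng = random.Random(seed)
    pop = [[[rng.randint(0, 1) for _ in range(width)]
            for _ in range(rng.randint(1, 8))] for _ in range(p)]
    pop.sort(key=fitness, reverse=True)
    best, stall = fitness(pop[0]), 0
    for _ in range(max_gen):
        if best >= goal or stall >= max_stall:
            break
        ranks = list(range(len(pop), 0, -1))  # linear, rank-based selection weights
        children = []
        for _ in range(round(p * p_o)):
            if rng.random() < p_m:
                children.append(mutate(rng.choices(pop, ranks)[0], width, rng))
            else:
                a, b = rng.choices(pop, ranks, k=2)
                children.append(mate(a, b, width, rng))
        pop = sorted(pop + children, key=fitness, reverse=True)[:p]
        new_best = fitness(pop[0])
        stall = 0 if new_best > best else stall + 1
        best = new_best
    return pop
```

The fitness callable is deliberately left as a parameter, since the stages described in the following sections reuse the same evolutionary core with different fitness functions.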
Fault Model
The RT-level single-bit stuck-at fault model was presented in [1]. In this model, a fault is defined as a single-bit stuck-at in the effect of an RT-level assignment operation: when a fault is present, the affected object (signal or variable target of an assignment statement) loads the correct value, except for one bit that remains stuck at 0 or 1.
Faults are single and permanent: only one fault is inserted at a time and the fault effect is present during the whole simulation. The RT-Level single-bit stuck-at fault model does not explicitly consider control-flow faults, such as stuck-at-true or stuck-at-false.
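As a small illustration of the fault model, the following Python sketch applies a single-bit stuck-at to the value loaded by an assignment and enumerates the resulting fault list. The function names and the representation of a fault as an (assignment, bit, stuck value) triple are assumptions made here for illustration, not notation taken from [1].

```python
def inject_stuck_at(value, bit, stuck_to, width):
    """Return the value an assignment loads when one bit is stuck at 0 or 1."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside the target object")
    value &= (1 << width) - 1              # keep only the bits of the target
    mask = 1 << bit
    return value | mask if stuck_to else value & ~mask

def enumerate_faults(assignment_widths):
    """List every (assignment, bit, stuck value) triple.

    assignment_widths maps a hypothetical assignment identifier to the bit
    width of its target signal or variable.
    """
    return [(name, bit, stuck_to)
            for name, width in assignment_widths.items()
            for bit in range(width)
            for stuck_to in (0, 1)]
```

Note that a fault is excited only when the stuck value differs from the correct one: forcing bit 1 of 0b1010 to 1 leaves the value unchanged.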
Initially, the Fault List contains all RT-level single-bit stuck-at faults. However, during synthesis the RT-level VHDL description is optimized in order to create an efficient gate-level design. The optimization process analyzes the VHDL description and simplifies the logic by eliminating redundancies. In this phase some RT-level stuck-at faults lose their corresponding gate-level faults. To prevent this discrepancy, it is necessary to identify which parts of the logic described at the RT-level disappear during the optimization phase of the synthesis process and to eliminate the associated faults from the Fault List.
Fault simulation adopts a serial strategy. The good machine and each faulty machine are simulated, and their outputs are compared. A fault is marked as detected if it produces a difference on a primary output of the circuit at the end of a clock cycle. To run the simulations, the Test Pattern is first transformed into a set of commands that force the correct waveform on the input signals, and the Fault List is transformed into a set of script commands for injecting faults during simulation.
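The serial strategy can be sketched as follows. This is a toy Python model under stated assumptions: the simulate callable stands in for the commercial VHDL simulator, and toy_accumulator is a hypothetical 4-bit circuit invented here, whose two upper bits are the only primary outputs.

```python
def serial_fault_simulation(simulate, sequence, fault_list):
    """Simulate the good machine once, then each faulty machine in turn.

    simulate(sequence, fault) must return the primary-output values sampled
    at the end of every clock cycle; fault=None selects the good machine.
    """
    good = simulate(sequence, None)
    detected = []
    for fault in fault_list:
        faulty = simulate(sequence, fault)
        if any(g != f for g, f in zip(good, faulty)):
            detected.append(fault)
    return detected

def toy_accumulator(sequence, fault):
    """Hypothetical circuit: acc := acc + x (mod 16), output = acc >> 2."""
    acc, outputs = 0, []
    for x in sequence:
        acc = (acc + x) & 0xF
        if fault is not None:              # permanent fault on this assignment
            bit, stuck_to = fault
            acc = acc | (1 << bit) if stuck_to else acc & ~(1 << bit)
        outputs.append(acc >> 2)           # only the two upper bits are observable
    return outputs
```

In this toy circuit a stuck-at-0 on bit 0 is excited by the one-vector sequence [1] but is not observed; it takes the longer sequence [1, 1, 1, 1] to propagate its effect to the output.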
Fault injection is made possible by creating routines that change the value of the target signal/variable bit during simulation, using the simulator scripting language (Tcl), when a given target assignment instruction is executed. The fault injection procedures must face various issues deriving from the fault model, from VHDL semantics and from the simulator itself.
Further details can be found in [2].
Analysis
The analysis aims at building a simplified internal model of the circuit.
Static structural information, control dependencies and data dependencies are extracted. The RT-level hierarchy is analyzed and processes are broken down into basic blocks, i.e., blocks of statements that are guaranteed to be always executed together.
Then a correlation matrix C is inferred by mixing the control-flow analysis with data dependencies. Let b_i and b_j be two basic blocks; the element c_ij of the correlation matrix C estimates the conditional probability that b_j will be executed given the execution of b_i.
The analysis is an automatic process performed through commercial tools for parsing VHDL. Each circuit has to be analyzed only once, since the information gathered during analysis does not depend on the results of the ATPG process.
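To make the meaning of a correlation entry concrete, the sketch below estimates such a matrix empirically from execution traces. This is not the static analysis used by Prince, which infers C from control flow and data dependencies; it is a deliberately simpler, trace-based stand-in. The function name, the trace format (one set of executed block indices per run), and the indexing convention (entry [i][j] is the probability that block j executes given that block i executes) are assumptions made here.

```python
def estimate_correlation(traces, n_blocks):
    """Estimate c[i][j] ~ P(block j executed | block i executed).

    traces is a list of sets; each set holds the indices of the basic blocks
    executed during one simulation run.
    """
    c = [[0.0] * n_blocks for _ in range(n_blocks)]
    for i in range(n_blocks):
        runs_with_i = [t for t in traces if i in t]
        if not runs_with_i:
            continue                      # block never executed: row stays at 0
        for j in range(n_blocks):
            c[i][j] = sum(j in t for t in runs_with_i) / len(runs_with_i)
    return c
```

With traces such as {0, 1}, {0, 2}, {0, 1, 2} and {2}, block 1 is executed in two of the three runs that execute block 0, so the corresponding entry is 2/3.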
Initialization
The initialization goal is to remove easy-to-detect faults.
Prince starts its GA to cultivate a sequence that maximizes the basic-block coverage. The initial population is generated randomly. The fitness function simply counts the number of covered basic blocks, without exploiting the knowledge of design structure. 
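The initialization fitness can be sketched as a plain count of distinct covered blocks. In this Python sketch the run hook stands in for a simulation that reports the executed basic blocks, and toy_run is a hypothetical three-block design invented here for illustration.

```python
def coverage_fitness(sequence, run):
    """Count the distinct basic blocks executed while applying the sequence.

    run(sequence) is a hypothetical hook returning the identifiers of the
    basic blocks executed, in order and possibly with repetitions.
    """
    return len(set(run(sequence)))

def toy_run(sequence):
    """Hypothetical design: block 0 always runs, block 1 on odd inputs,
    block 2 when an input repeats the previous one."""
    executed, previous = [], None
    for x in sequence:
        executed.append(0)
        if x % 2:
            executed.append(1)
        if x == previous:
            executed.append(2)
        previous = x
    return executed
```

No structural knowledge is used at this stage: every covered block contributes equally to the fitness.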
Basic-Block Fault Detection
After easy-to-detect RT-level faults have been removed, Prince starts the main ATPG process.
Let b_i be the i-th basic block, and F_i the set of all still undetected RT-level single-bit faults on assignments performed inside b_i. The basic-block fault detection stage is iterated for each non-empty F_i.
Let F_i be a non-empty set of faults selected as the target for the fault detection stage. Prince first tries to create a sequence able to cover basic block b_i. The search exploits the same GA described in Section 3. The fitness function counts each activated basic block, weighting it by its correlation with the target b_i. See function (2) in Figure 2.
If such a sequence is found, it is guaranteed that all statements of b_i are executed; however, it is not certain that the faults in F_i are excited, since the faulty value of the bit may be the same as the good one. Nevertheless, since the sequence is potentially useful, it is added to the final test set. When the GA is halted, the set of the p sequences in the last population is stored for later use.
At this point, Prince starts trying to detect all faults of F_i. Here, as in gate-level ATPG, "detect" implies exciting the fault and then observing it by propagating its effect to a primary output. At present, Prince exploits the simulation mechanism described in [2]: a loose interaction with a commercial VHDL simulator carried out through Tcl scripts. The new step still exploits the GA described above. The initial population is loaded from the population stored at the end of the previous step, and a third fitness function is adopted. This fitness function measures how many faults in F_i have been detected, how close the sequence is to observing new faults, and how many faults in F_i have been excited. The three contributions are weighted in decreasing importance, thus observing is more important than exciting, and so on. See function (3) in Figure 2.
Once a sequence able to detect new faults is found, it is added to the final test set. Then all faults detected by that sequence are removed from F_i. It should be noted that removing detected faults from F_i leads to a change in the fitness function, because during fitness calculation only faults in F_i are considered. Finally, the whole population is re-evaluated and sorted according to the new criteria. Detection is iterated until F_i is empty or the GA aborts.
Fault Dropping
Each time a new sequence is added to the final test set, it is simulated against all still undetected RT-level faults. The fault-dropping phase exploits structural information. Prince simulates the sequence to get the list of covered basic blocks, and only the faults on covered statements are injected.
The speed-up achieved by fault dropping is considerable. When Prince tackles the b14 benchmark, for instance, more than 7,500 RT-level faults (67% of the total) are dropped in this stage. However, it should be pointed out that it is the most time-consuming step of the algorithm. RT-level fault simulation, in general, is a resource-consuming task. In the proposed approach the circuit is not modified and does not need to be recompiled; however, relying on an external commercial simulator introduces some overhead. When new values are forced during simulation, several RT-level faults cause overflows and boundary-check errors. Prince can handle all these exceptions, but it cannot handle them efficiently. When the simulator hangs, its process must be killed using Unix signals and it must be started again.
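The structural filter used during fault dropping can be sketched as follows. This Python fragment is an illustration under stated assumptions: covered_blocks and detects are hypothetical hooks standing in for the coverage run and for the fault simulation of one faulty machine, and the fault list is represented as a dictionary from fault to the basic block containing its assignment.

```python
def drop_faults(sequence, undetected, covered_blocks, detects):
    """Drop the faults detected by a new sequence, skipping uncovered blocks.

    undetected maps each fault to the basic block holding its assignment.
    Returns the remaining faults and the number of fault simulations run.
    """
    covered = set(covered_blocks(sequence))
    remaining, simulated = {}, 0
    for fault, block in undetected.items():
        if block in covered:
            simulated += 1                 # only faults on covered statements are injected
            if detects(sequence, fault):
                continue                   # detected: dropped from the fault list
        remaining[fault] = block
    return remaining, simulated
```

A fault whose assignment is never executed cannot be excited, so skipping it saves one faulty simulation without changing the outcome.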
Experimental Results
To evaluate observability techniques, we analyze the ITC'99 VHDL RT-level benchmark circuits. Circuit characteristics are summarized in Table 1. In the experiments, the population is composed of p = 50 individuals, with an offspring ratio of p_o = 60%.
Thus, in each generation 30 new sequences are first generated, then selection is performed on the whole set of 50+30 individuals. The mutation rate was set to 0.3, hence in 30% of the cases, the new individual is built mutating a single parent, while in 70% of the cases the new individual is built mating two different sequences.
The maximum number of generations m_g was set to 50, and the limit on generations without fitness improvement was set to 10 for all circuits.
Experiments were run on a Sun Enterprise 250 running at 400 MHz and equipped with 2 Gbytes of RAM.
The required CPU times range from some hours to two days and are mainly due to the lack of flexibility of the commercial RT-level simulator. The efficiency of the approach would greatly increase if a closer interaction with the simulation core were available.
Table 2 summarizes the results achieved by Prince on the benchmark circuits. The length of the test set (after compaction) is reported in the second column. The next column reports the fault coverage attained on the RT-level fault list. After generation, test sets were simulated against gate-level netlists, and the stuck-at fault coverage is reported in the last column.
In Table 3, gate-level stuck-at fault coverage figures are compared with previous works (the Fault Coverage represents the percentage of detected faults in the fault list). [...] mentioning which fault list they are using. These numbers are higher than the number of both collapsed and un-collapsed faults reported in [3]. For the sake of comparison, Table 3 [...]
Experimental results show that Prince is usually superior and at least comparable to both versions of ARTIST, the original one presented in [3] and the observability-enhanced one presented in [4]. It is remarkable that Prince was able to generate test sequences even for the larger benchmarks, while the observability-enhanced ARTIST cannot tackle circuits bigger than b12.
Compared to gate-level approaches, results are convincing. The attained Fault Coverage is higher than that of the commercial ATPG and, for b21, considerably higher than that of the state-of-the-art academic approach. Yet the few data presented in [12] prevent a more insightful comparison.
Benchmarks b12 and b15 are difficult even for gate-level ATPGs, but they deserve some comments. They both need extremely long and specific test sequences to activate all functionalities (b12 implements a guess-a-sequence game, b15 a microprocessor). For the sake of performance, Prince was pushed to avoid such long sequences, and this choice may have penalized it. In fact, even the statement coverage figures for the two benchmarks are quite low: 68.72% for b12 and 63.04% for b15. More experiments are being performed to better understand this behavior.
Conclusions
Due to the wide adoption of logic synthesis tools, RT-level ATPG techniques are increasingly necessary in order to shift test-related activities towards the description level adopted by designers. A crucial point for developing effective high-level ATPGs lies in the identification of a suitable fault model, which should guarantee a good correlation with gate-level fault coverage figures while allowing the implementation of an ATPG algorithm. This paper presented Prince, an algorithm for implementing a high-level ATPG that mixes a code coverage-oriented approach with fault-oriented optimizations.
Prince adopts a fault model at the RT-level that enables efficient fault simulation and guarantees good correlation with gate-level fault coverage.
Experimental results showed that Prince is broadly applicable, and it attains fault coverage figures usually superior and at least comparable to other RT-level approaches. Also compared to gate-level approaches, results are considerable. The two cases in which the approach is less satisfactory were analyzed and are currently under a deeper study.
Acknowledgments
