Automatic Test Pattern Generation (ATPG) is an important task to ensure that a chip functions correctly. For high-speed chips, testing for dynamic fault models such as the path delay fault model becomes more and more important. While classical algorithms for ATPG reach their limits, the significance of algorithms to solve the Boolean Satisfiability (SAT) problem grows due to recent developments of powerful SAT solvers. However, ATPG is not always a purely Boolean problem. For generating robust test patterns for delay faults, multiple-valued logics are needed. To apply a (Boolean) SAT solver to a problem modeled in multiple-valued logic, a Boolean encoding has to be used.
Introduction
Ensuring the correctness of a chip before it is delivered is an important task. Every chip has to pass a post-production test, in which its correctness is checked by applying test patterns that are generated by Automatic Test Pattern Generation (ATPG) algorithms. Due to increased process variability, defects leading to timing violations are becoming dominant in modern chips. Such delay defects are tested by using dynamic fault models such as the Path Delay Fault (PDF) model [1] .
A test for a PDF is a two-pattern test, i.e. it consists of a pair of test vectors v 1 , v 2 . The first test vector v 1 sets the initial value and the second test vector v 2 launches the desired transition, which can be either rising or falling. The transition is then propagated along a structural path to an output, where it can be observed whether the accumulated delay along the path violates any timing constraints.
* Parts of this research work were supported by the BMBF in the Project MAYA under contract number 01M3172B and by DFG grant DR 287/15-1.
In [2] , tests for PDFs have been classified into two different categories: robust and non-robust. Because robust tests guarantee the detection of delay faults in the presence of other delay faults, they are more desirable but also harder to obtain in terms of complexity. In the last decade, powerful engines to solve the Boolean satisfiability (SAT) problem have been developed. Modern SAT solvers [3] [4] [5] [6] incorporate techniques such as conflict-based learning, non-chronological backtracking and efficient search heuristics. Due to their efficiency, SAT solvers serve as a core engine for many problems in the field of Computer-Aided Design such as verification and ATPG.
SAT-based ATPG for PDFs was first introduced in [7] . In this approach, a 7-valued logic proposed in [8] with a fixed Boolean encoding was used to generate robust tests for PDFs. However, this approach was restricted to combinational circuits, because sequential behavior could not be modeled using this logic. As a result, this approach is hardly feasible for today's circuits. Other approaches were presented in which SAT techniques were applied to path sensitization [9] and to non-robust and semi-robust tests [10] , respectively. These approaches used only Boolean logic and cannot model the static values that are necessary for generating robust tests.
In [11] , robust test generation for PDFs in circuits with unknown values and tri-state elements was performed. A 19-valued logic and derivative logics with a smaller number of values were developed. The Boolean encoding for these logics was chosen randomly. However, in [12] , it was shown that the chosen encoding has significant influence on the SAT instance and therefore on the performance of the SAT solver. A detailed study on the influence of the chosen encoding in SAT-based ATPG for stuck-at faults is presented in [13] . However, that study analyzes only Boolean encodings for a four-valued logic.
In this paper, the influence of the Boolean encodings for the 19-valued logic and its derivatives on the performance of the overall test generation algorithm is studied. Furthermore, the relation between the size of the SAT instance and the performance of robust test generation is evaluated and a method to identify efficient encodings is shown. Representative encodings have been developed and integrated into the test generation algorithm. Experimental results on ISCAS benchmarks and industrial circuits containing multiple-valued logic are provided to show the impact of the chosen Boolean encoding.
This paper is structured as follows. In the next section, the basic concepts of SAT-based ATPG for PDFs and the usage of Boolean encodings are explained, whereas alternative Boolean encodings are discussed and analyzed in Section 3. Experimental results are presented in Section 4 and conclusions are drawn in the last section.
SAT-based ATPG for PDFs
In this section, the basic concepts of robust SAT-based ATPG for PDFs are briefly reviewed. For more details, we refer to [2] for the PDF model and to [11] for the SAT formulation. Section 2.1 introduces the PDF model with respect to robust and non-robust test classification, whereas in Section 2.2 the SAT formulation of robust test generation is presented. In Section 2.3, the usage of Boolean encodings is explained.
Path Delay Fault Model
The PDF model describes a distributed delay fault on a path from a (pseudo-)primary input to a (pseudo-)primary output of a circuit C. To detect such a fault, a transition which is either rising or falling is propagated along the path. Therefore, a PDF F can be defined as a tuple F = (P, T ) where P is a sequence of gates g 1 , ..., g n with input g 1 and output g n . The type of transition at g 1 is given by T . Note that the transition is inverted after passing an inverting gate on the path, e.g. NAND or NOR. A test pattern which detects a PDF contains two test vectors v 1 , v 2 that are applied in two consecutive time frames t 1 , t 2 . The test vector v 1 sets the initial value in t 1 , whereas v 2 launches the transition in t 2 . In case of a delay fault, the transition does not arrive at g n within the specified time, i.e. a timing violation occurs.
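The tuple definition and the inversion rule can be illustrated by a minimal Python sketch. The class name, the representation of P as a tuple of gate-type strings and the set of inverting gate types are assumptions made for illustration only; they are not taken from the implementation described in this paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Gate types assumed to invert a transition (illustrative set).
INVERTING = {"NAND", "NOR", "NOT"}


@dataclass(frozen=True)
class PathDelayFault:
    """F = (P, T): a path of gates and the transition launched at g_1."""
    path: Tuple[str, ...]   # gate types of g_1 ... g_n (hypothetical representation)
    transition: str         # T at g_1: "rising" or "falling"


def transitions_along_path(fault: PathDelayFault) -> List[str]:
    """Return the transition at the output of each gate g_1 ... g_n."""
    current = fault.transition
    result = []
    for gate_type in fault.path:
        # An inverting gate flips the polarity of the propagated transition.
        if gate_type in INVERTING:
            current = "falling" if current == "rising" else "rising"
        result.append(current)
    return result


fault = PathDelayFault(("INPUT", "NAND", "AND", "NOR"), "rising")
print(transitions_along_path(fault))
```

In this example the rising transition at the input becomes falling after the NAND gate, stays falling through the AND gate and is rising again after the NOR gate.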
The task of ATPG for PDFs is to generate two such test vectors which detect a potential delay fault on P . According to [2] , there exist two different categories of tests for PDFs: robust and non-robust. The difference is that robust tests guarantee the detection of a PDF even when other delay faults are present. This cannot be guaranteed by non-robust tests, because other delay faults can mask the considered PDF. From the technical point of view, both test categories differ in the constraints at the off-path inputs of P . An off-path input is defined as an input of a gate g i on path P that is not g i−1 .
The constraints at the off-path inputs are presented in Table 1 . The value X1 (X0) on an off-path input signifies that the final value in t 2 has to be 1 (0), whereas S1 (S0) means that both the initial value in t 1 and the final value in t 2 have to be 1 (0) and no hazards occur between them, i.e. the signal is static.
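As a rough illustration of how such constraints can be looked up, the following Python sketch encodes the commonly used off-path conditions for primitive gates. It is an assumption that these conditions coincide with Table 1, which is not reproduced here; the function name and its parameters are hypothetical.

```python
# Non-controlling value of each primitive gate type.
NON_CONTROLLING = {"AND": 1, "NAND": 1, "OR": 0, "NOR": 0}


def off_path_constraint(gate_type: str, on_path_final_value: int, robust: bool) -> str:
    """Return the required off-path value (X0, X1, S0 or S1) for one gate.

    on_path_final_value is the value of the on-path input in time frame t2.
    Assumed rule (standard in the literature, not copied from Table 1):
    a non-robust test only needs the final non-controlling value, whereas a
    robust test needs a static non-controlling value whenever the on-path
    input ends at the controlling value.
    """
    nc = NON_CONTROLLING[gate_type]
    if not robust or on_path_final_value == nc:
        return f"X{nc}"
    return f"S{nc}"


# Falling transition at the on-path input of an AND gate, robust test:
print(off_path_constraint("AND", 0, robust=True))
```

The static values S0 and S1 are the reason why a purely Boolean two-time-frame model is not sufficient for robust test generation.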
SAT Formulation: Robust Tests
SAT solvers work on problems represented in Conjunctive Normal Form (CNF). A CNF $\Phi$ is a conjunction of clauses, whereas a clause is a disjunction of literals. Therefore, the problem must be converted to CNF. For a circuit $C$ and a PDF $F = (P, T)$, the CNF $\Phi_F$ can be obtained by the following formula: $\Phi_F = \Phi_C \cdot \Phi_T \cdot \Phi_P$, where $\Phi_C$ is the characteristic function of $C$, $\Phi_T$ forces the transition and $\Phi_P$ sets the corresponding constraints at the off-path inputs of $P$. The characteristic function $\Phi_C$ is the conjunction of the characteristic functions of all gates $g$ in the circuit and can be obtained by $\Phi_C = \prod_{g \in C} \Phi_g$.
The derivation of Φ g depends on the applied logic. If Boolean logic is used, Φ g can be easily derived using the method proposed in [14] . In case of a higher-valued logic L m with m values and m > 2, a Boolean encoding is needed that maps each value into the Boolean domain (cf. Section 2.3). To obtain a minimized CNF representation, the logic optimizer ESPRESSO of the SIS package [15] is used.
According to [11] , for robust test generation in Boolean circuits a six-valued logic L 6 is needed, whereas for industrial circuits containing multiple-valued logic, a 19-valued logic L 19 has to be used. Because a higher-valued logic typically results in more complex SAT instances and not all values of L 19 can be assumed by each signal line, an algorithm was proposed which uses the higher-valued logics only where necessary. For most parts of the circuit, a lower-valued logic is sufficient. In this way, the size of the SAT instance is reduced (see [11] for more details).
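For the Boolean case, the construction of Φ g and Φ C can be sketched in Python as follows. The sketch uses the well-known Tseitin-style clauses for an AND gate; whether this is exactly the method of [14] is an assumption, and all function names are hypothetical. Literals are signed integers as in the DIMACS format.

```python
from itertools import product


def and_gate_cnf(a: int, b: int, c: int):
    """Clauses for c <-> (a AND b); positive int = variable, negative = negation."""
    return [[-c, a], [-c, b], [c, -a, -b]]


def circuit_cnf(gate_cnfs):
    """Phi_C as the conjunction of all Phi_g, i.e. the concatenated clause lists."""
    return [clause for cnf in gate_cnfs for clause in cnf]


def satisfies(cnf, assignment) -> bool:
    """Check whether a {variable: bool} assignment satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


# Brute-force check: the clauses hold exactly for consistent gate assignments.
phi_g = and_gate_cnf(1, 2, 3)
for a, b, c in product((False, True), repeat=3):
    assert satisfies(phi_g, {1: a, 2: b, 3: c}) == (c == (a and b))

# Two cascaded AND gates: variable 3 = 1 AND 2, variable 5 = 3 AND 4.
phi_c = circuit_cnf([and_gate_cnf(1, 2, 3), and_gate_cnf(3, 4, 5)])
print(len(phi_c))
```

For a multiple-valued logic, each signal is represented by several Boolean variables, so the clause set of a gate depends on the chosen encoding.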
Usage of Boolean Encodings
To apply a Boolean SAT solver to a multiple-valued ATPG problem, e.g. generating robust test patterns for path delay faults, the problem has to be transformed into a Boolean problem. This can be done by using a Boolean encoding $\eta$ for each value in the multiple-valued logic $L_m$. The minimal number of Boolean variables $n$ needed to encode a value depends on the number of values of $L_m$ and is defined as follows: $n = \lceil \log_2 |L_m| \rceil$. The following study is restricted to these logarithmic encodings.
Consequently, three variables are needed to encode both logics $L_6$ and $L_8$, whereas $L_{11}$ and $L_{19}$ have to be encoded by four and five Boolean variables, respectively. More formally, the Boolean representation of a signal $s$ whose value is defined over $L_m$ is given by $n$ Boolean variables $x_1 \ldots x_n$. Each value of $L_m$ is represented by an assignment $a_1 \ldots a_n$ of $x_1 \ldots x_n$, where $a_i \in \{0, 1\}$ for $1 \le i \le n$. The complete Boolean encoding $\eta$ for $L_m$ is defined as follows: $\eta : L_m \to \{0, 1\}^n$.
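The number of variables of a logarithmic encoding and the shape of one possible encoding can be sketched in Python as follows. The value names v0, v1, ... and the particular code assignment are placeholders chosen for illustration; they do not correspond to an encoding evaluated in this paper.

```python
from itertools import product


def num_encoding_vars(num_values: int) -> int:
    """Smallest n with 2**n >= num_values, i.e. the ceiling of log2(num_values)."""
    return (num_values - 1).bit_length()


def example_encoding(values):
    """Map each value to a distinct assignment a_1 ... a_n (arbitrary choice)."""
    n = num_encoding_vars(len(values))
    codes = product((0, 1), repeat=n)   # all assignments of x_1 ... x_n
    return dict(zip(values, codes))


for m in (6, 8, 11, 19):
    print(m, num_encoding_vars(m))

eta = example_encoding([f"v{i}" for i in range(6)])   # placeholder value names
print(eta)
```

The integer computation via `bit_length` avoids floating-point rounding; for the logics used here it yields three, three, four and five variables.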
Analysis of Alternative Boolean Encodings
The Boolean encoding used in the initial approach was chosen rather randomly and its efficiency was not analyzed against other encodings. In this section, alternative Boolean encodings are discussed and analyzed.
Already for the CNF generation of a circuit modeled in L 6 , one Boolean encoding out of 8!/2 = 20160 has to be chosen. The number of potential Boolean encodings grows with the number of values of the logic. For L 8 , there are 8! = 40320 Boolean encodings, whereas for L 11 , there are more than one billion. Testing all possible encodings and selecting the most efficient one is therefore not feasible. Some preselection must be done to identify efficient encodings. To study the impact of the encodings, inefficient encodings have to be determined, too. Note that preliminary experiments have shown that, due to the small number of gates that have to be modeled in L 19 , changing the Boolean encoding of L 19 had nearly no impact on the run time. Therefore, Boolean encodings for L 19 are not discussed in the following.
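The numbers above are consistent with counting the injective assignments of the available bit patterns to the values of the logic. The following Python sketch makes this counting explicit; that this is the counting argument intended for L 11 is an assumption, and the function name is hypothetical.

```python
from math import perm


def num_encodings(num_values: int) -> int:
    """Number of ways to assign distinct n-bit codes to num_values values."""
    n = (num_values - 1).bit_length()      # number of Boolean variables
    return perm(2 ** n, num_values)        # (2^n)! / (2^n - num_values)!


print(num_encodings(6))    # 8!/2!
print(num_encodings(8))    # 8!
print(num_encodings(11))   # 16!/5!
```

`math.perm` requires Python 3.8 or later.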
Compactness of Boolean Representation
Typically, but not necessarily, a larger SAT instance results in higher run times of the SAT solver. Moreover, in the field of SAT-based ATPG, the SAT solver has to cope with thousands of smaller instances. Although building a SAT instance has only linear complexity, the overhead is not negligible in the overall run time. Therefore, a Boolean encoding with a compact CNF representation is likely to perform well, whereas a Boolean encoding with a large CNF representation probably performs poorly. Each gate type has a different CNF representation, and a preliminary evaluation has shown that a single Boolean encoding may produce a compact representation for one gate type but a large one for other gate types (e.g. busdriver). Because most gates in a circuit are primitive gates, e.g. AND and OR, and not higher-level gates, e.g. busdriver, we concentrate only on the size of the CNF representation of primitive gates in the following. Primitive gates also have the advantage that the sizes of their representations are very similar for a specific encoding.
Below, the CNF sizes of the Boolean encodings are analyzed. The compactness of the Boolean representation of each encoding e is denoted as C e and is defined as a tuple (|cls| , |lits|) that contains the accumulated number of clauses (cls) and the accumulated number of literals (lits) of the gate types AND and OR. The accumulation was done to obtain a good ratio of the compactness of both gate types.
The distributions of the compactness values of all possible Boolean encodings for L 6 and for L 8 are shown in Figure 1 and in Figure 2 , respectively. The Most Compact Encodings (MCE) of L 6 have 32 clauses and 88 literals (accumulated for AND and OR), whereas the largest encodings have 67 clauses and 247 literals. This is more than twice the number of clauses and nearly three times the number of literals of the MCEs. The difference between the most compact and the largest encoding increases for L 8 . Here, the number of clauses in the largest encoding (97) is 2.6 times that of the most compact one (38), and the number of literals (448) is 3.8 times that of the most compact one (118).
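The compactness measure and the ratios quoted above can be reproduced with a short Python sketch. The two Boolean gate CNFs serve only as a small input for the counting function; they are not encodings of L 6 or L 8 , and the function name is hypothetical.

```python
def compactness(*gate_cnfs):
    """Return (|cls|, |lits|) accumulated over the given gate CNFs."""
    clauses = sum(len(cnf) for cnf in gate_cnfs)
    literals = sum(len(clause) for cnf in gate_cnfs for clause in cnf)
    return clauses, literals


# Plain Boolean AND and OR gates (c = 3, inputs 1 and 2) as a toy input.
and_cnf = [[-3, 1], [-3, 2], [3, -1, -2]]
or_cnf = [[3, -1], [3, -2], [-3, 1, 2]]
print(compactness(and_cnf, or_cnf))

# Ratios between largest and most compact encodings, from the values in the text.
l6_mce, l6_largest = (32, 88), (67, 247)
l8_mce, l8_largest = (38, 118), (97, 448)
for mce, largest in ((l6_mce, l6_largest), (l8_mce, l8_largest)):
    print(round(largest[0] / mce[0], 1), round(largest[1] / mce[1], 1))
```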
Due to the very high number of possible encodings for L 11 , the range of the compactness values for the encodings of L 11 is determined with a simplified method. The compactness values of only those encodings of L 11
It can be concluded that the chosen Boolean encoding has, independently of the logic used, an enormous impact on the size of the SAT instance. However, using only compatible encodings sets tight constraints on the choice of Boolean encodings and prevents the joint usage of the MCEs of each logic.
Efficiency of Most Compact Encodings
The size of the SAT instance is only one indicator for the efficiency of a Boolean encoding. Therefore, the MCEs of each logic are investigated according to their run time for a single circuit. To avoid influences from other encodings, the circuit must be modeled by only one single logic, i.e. either L 6 , L 8 or L 11 . The ISCAS '85 circuit c6288 representing a 16-bit multiplier was chosen to test the efficiency of the MCEs of each logic. All structural paths with a length of over 40 gates were identified and were set as targets (rising and falling) for robust test generation. This results in 3200 ATPG calls for each encoding.
The tests were carried out for each logic on a Dual Dual-Core Xeon (3000 MHz, 32768 MByte RAM) running GNU/Linux. In each of the three runs, the circuit is modeled completely with L 6 , L 8 and L 11 , respectively. For each logic, a set containing the MCEs is identified and robust test generation is executed for each encoding in the set. In Table 2 , statistical data and the overall results of the runs are given. The first column gives the logic, whereas the second column gives the number of runs. The third column presents the compactness values of the chosen encodings. In the following columns min, av. and max, the smallest, the average and the highest run time, respectively, are given in CPU seconds. Figure 3 shows the run time distribution for each logic on a logarithmic scale. The run times for each logic were sorted. The value on the x-axis defines the position in the sorted list and the value on the y-axis gives the run time in seconds. The upper curve denotes the run times of the MCEs of L 11 , whereas the middle curve and the lower curve give the run times of the MCEs of L 8 and L 6 , respectively.
For L 6 , even the MCEs differ strongly in their run time behavior. The highest run time is over 4 times the minimal one, although the encodings have equal compactness values. The range is even larger for the higher-valued logics L 8 and L 11 . The highest run time for L 8 is eight times the minimal run time for L 8 , whereas for L 11 , the highest run time is nearly 16 times the minimal run time. While the curve of L 6 increases only very smoothly, the curves of L 8 and L 11 are steeper, suggesting that encodings of L 8 and L 11 have to be chosen more carefully. Note that the encodings with the minimal run time for each logic are denoted as Most Efficient Encodings (MEE) in the following.
The application of the MCEs for robust test generation shows that, first, equal compactness values do not guarantee the same run time behavior, and second, the impact on the efficiency increases with a higher-valued logic.
Experiments
In this section, alternative Boolean encodings are experimentally evaluated. First in Section 4.1, four experiments with representative Boolean encodings are described. The experimental results of these encodings are shown in Section 4.2.
Encoding Selection
In this section, Boolean encodings are created to determine their influence on the ATPG run time. The compactness values of each encoding can be found in Table 3 . Note that in the following, an encoding refers to a set of compatible encodings for each logic rather than to a single encoding if no logic is explicitly named. Four different experiments are described below:
• Experiment 1 shows the behavior of two encodings, one of which is likely to be very efficient, whereas the other is probably inefficient. For this, a compact encoding η L6com (MCE of L 6 ) and a large encoding η L6lar are chosen. Note that the encoding of L 6 is created first and the compatible encodings are selected afterwards. Here, the most compact and the largest encodings, respectively, are selected among the compatible encodings. If not mentioned otherwise, this is the standard flow of choosing compatible encodings. Furthermore, an encoding η L6med of medium size (applied in [11] ) is selected.
• Experiment 2 shows the influence of the encoding selection for L 11 on the ATPG performance. For this purpose, a compact encoding η L11com (MCE of L 11 ) is created. Next, an encoding set η L11lar is generated such that the encodings for L 6 and L 8 are equal, but instead of choosing an MCE of L 11 the largest compatible encoding is selected.
• In Experiment 3, the influence of the encoding selection for L 8 is investigated. First, a compact encoding η L8com is generated. Then, an encoding set η L8lar is created that contains the same encoding for L 6 but different encodings of L 8 and L 11 . Note that possible differences in run time cannot clearly be attributed to the encoding of L 8 , because the encoding for L 11 also differs.
• In Experiment 4, the MEEs of each logic are evaluated for all circuits. This is to show that, to achieve a good overall performance, it is not sufficient to use an encoding optimized for one logic only. Therefore, the encodings η L6mee (MEE of 
Experimental Results
In this section, the results of the four experiments are presented. The experiments were carried out on an AMD64 4200+ (2200 MHz, 2048 MByte RAM) running GNU/Linux. The program was implemented in C++ and the SAT solver MiniSat 1.14 [6] serves as core engine. As benchmarks, ISCAS '85 circuits and industrial circuits provided by NXP Semiconductors Hamburg, Germany were used. The name of the p-circuits roughly denotes their size, e.g. circuit p1330k has about 1.3 million gates. More statistical data about the circuits is given in Table 4 . For each circuit, the number of inputs (column #PI ), the number of tri-state elements (column #Tri) and the number of flip-flops (column #FF ) are shown. Furthermore, the percentages of gates modeled in L 11 and L 8 are given in column %L 11 and column %L 8 , respectively. The number of paths for which robust test generation is executed is presented in column #PUT. As test targets, only paths with a length of over 40 gates are selected. The maximum number of test targets was set to 20100. The paths are chosen randomly, but to avoid testing paths of a small part of the circuit only, at least one path starts at every input (if such a long path exists).
Table 4
Circuit   #PI   #Tri    #FF   %L 11   %L 8    #PUT
c1908      33      0      0       0      0    2264
c2670     157      0      0       0      0    1400
c3540      50      0      0       0      0    3700
c5315     178      0      0       0      0    4340
c6288      32      0      0       0      0    3200
c7552     206      0      0       0      0    6360
p44k      739      0   2175       0      0   20000
p49k      303      0    334       0      0   12390
p57k        8      0
In Table 5 , the results of the selected encodings are shown. Time is measured in minutes (m) and hours (h), respectively. The timeout for each target was set to 20 seconds, whereas the timeout for each ATPG run was 20 hours. The minimum run time of all encodings is marked bold for each circuit.
In Experiment 1, it is shown that the influence of the Boolean encoding is significant. The run time for the large encoding η L6lar increases dramatically, by up to a factor of 56 (p80k) compared to η L6com and by up to a factor of 44 compared to η L6med . In five out of eleven industrial circuits, η L6lar even reaches the limit of 20 hours. Therefore, η L6lar is not feasible for industrial practice. Comparing η L6com and η L6med , the compact encoding is in most cases only slightly better than η L6med and in two cases (c6288, p57k) even worse.
In Experiment 2, the influence of the chosen encoding for L 11 is evaluated. In those circuits with no or only few parts modeled in L 11 , the run times are the same or even slightly better using the large encoding η L11lar . In circuits with a higher percentage of L 11 , the maximum overhead is about 25% of run time (p1330k).
In Experiment 3, the influence of the chosen encoding for L 8 is investigated. The results are similar to those of Experiment 2, but the impact on the run times is larger. For p456k, where nearly two-thirds of the circuit is modeled in L 8 , the run time is increased by a factor of 2.9 and for p57k a timeout occurred.
In Experiment 4, the MEEs of each logic are investigated for all circuits. Encoding η L6mee , the MEE of L 6 , is also the most efficient encoding for all other ISCAS circuits. For the industrial circuits, however, η L6mee provides the smallest run time only for p49k (completely in L 6 ) but not for any other circuit. Compared to each other, η L8mee2 has an advantage over η L8mee1 for the smaller circuits (with a lower percentage of L 8 ), whereas η L8mee1 is better for the larger ones and therefore preferable. Encoding η L11mee (MEE of L 11 ) yields only a minimal performance gain for those circuits with a large portion of L 11 (e.g. p177k, p456k). This experiment shows that the usage of an encoding which is optimized for one logic only is not optimal due to the different logic modeling of the circuits. Therefore, we propose the combination of multiple encodings depending on the percentage of the used logics. For instance, the combination of η L8mee1 /η L8mee2 and η L6mee , where η L6mee is applied if the percentage of L 8 in the circuit is lower than 25% (η L8mee1 /η L8mee2 otherwise), would provide the best result for 11 out of 17 circuits and is therefore more robust.
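The proposed combination can be written as a simple selection rule. The following Python sketch is only an illustration of the 25% threshold stated above; the function name, the string labels and the parameter for choosing between η L8mee1 and η L8mee2 are hypothetical, since the text does not fix which of the two is applied.

```python
def select_encoding(percent_l8: float, l8_variant: str = "L8mee1",
                    threshold: float = 25.0) -> str:
    """Choose an encoding set from the share of gates modeled in L8.

    percent_l8: percentage of gates modeled in L8 (column %L8 of Table 4).
    l8_variant: "L8mee1" or "L8mee2"; which one to prefer is left open here.
    """
    if l8_variant not in ("L8mee1", "L8mee2"):
        raise ValueError("l8_variant must be 'L8mee1' or 'L8mee2'")
    # Below the threshold the L6-optimized encoding is used, otherwise an L8 MEE.
    return "L6mee" if percent_l8 < threshold else l8_variant


print(select_encoding(0.0))     # circuit modeled without L8
print(select_encoding(60.0))    # circuit with a large share of L8
```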
Conclusions
The influence of Boolean encodings in SAT-based ATPG for PDFs, in which a set of multiple-valued logics is applied, has been studied in detail. First, it is shown that the size of the SAT instance strongly depends on the chosen encoding. Moreover, the effect increases with the number of values of the logic. It is also pointed out that the compactness of a Boolean encoding is only an indicator for the performance. Experiments have shown that the performance for encodings with equal size can vary by a factor of 4 for a lower-valued logic and by a factor of 16 for a higher-valued logic.
Representative encodings have been developed and their influence has been evaluated on ISCAS '85 circuits as well as on industrial circuits. According to the results, the influence is significant and the Boolean encodings have to be chosen carefully to avoid poorly performing test generation. Moreover, it has been highlighted that the performance of the encoding is highly influenced by the logic modeling of the circuit. Thus, the best result is obtained by a combination of different encodings according to the circuit's logic modeling. It could also be observed that the usage of only compatible encodings sets tight constraints on the selection process. Therefore, studying the usage of different encodings in detail and the application of incompatible encodings is future work.
