Abstract-In this work, based on the concept of test pattern broadcasting [1], [2], we propose a new core-based testing method which gives core users the maximum level of test freedom. Instead of only using the test patterns delivered by core providers, core users are allowed to broadcast their own test patterns to the cores of a SoC (system on chip) design for parallel scan testing. The fault coverage of each core test, using test patterns developed by any core user, can be evaluated by an enhanced version of a traditional fault simulator. The netlist of each core is scrambled before it is delivered to core users, thus the netlist will not be revealed. The enhanced fault simulator of a core has the capabilities of decoding the scrambled netlist, and performing fault simulation for the test patterns provided by each of the core users. For each core, both random test patterns (applied by a core user), and golden test patterns (delivered by the core provider) jointly achieve high and flexible fault coverage requirements. The enhanced logic simulator of each core can also decrypt the scrambled netlist, and perform logic simulation with the objective of generating fault-free test responses for signature analysis (for example). The proposed method has the advantages of minimizing the number of scan pins, reducing the test application time, and achieving the maximum level of test quality control by core users. Simulation results demonstrate the feasibility of this method.
1 The singular and plural of an acronym are always spelled the same.
Based on the concept of test pattern broadcasting, we have presented a new core-based test generation approach for random logic core circuits. The basic idea is to broadcast a circuit module's (UDL or core) deterministic test patterns to all other circuit modules (UDL and cores), such that all circuits can be tested simultaneously by sharing the test patterns. The fault coverage of each core by applying test patterns not delivered by the core provider can be evaluated by its enhanced fault simulator. The enhanced fault simulator can read the encrypted netlist of the core such that the goals of netlist protection and fault coverage analysis can be achieved. The idea of test pattern broadcasting can be recursively applied to all circuit modules until they have been tested with satisfactory fault coverage, or most faults remaining undetected are random-pattern-resistant faults. The fault coverage of each core can be further increased by selecting a subset of deterministic test patterns delivered by the core provider.
The proposed core-based test method has the advantages of low hardware overhead (virtually single scan chain architecture), high fault coverage (using hybrid random test patterns and deterministic test patterns), and short test time (parallel testing). The most significant benefit is that core users can achieve the required fault coverage of each core by applying test patterns not provided by the core vendor. Thus, each core user can apply a suitable number of test patterns (if possible) for a core based on the test budget allowed. The encryption and decryption times of a core netlist might be large if the netlist length and the key length are both large. As demonstrated, CPU times for encryption and decryption can be greatly reduced (and the netlist can also be better protected) by using a mixture of different coding methods.
This work mainly deals with the single stuck-at fault model as far as fault coverage is concerned, though the methods of test pattern broadcasting and netlist encryption and decryption may be extended to the domain of IDDQ testing and delay testing. For a SoC design, not only do the cores themselves need to be tested, but also the wires and the noncore circuitry between the cores need to be tested. Interconnects between cores can be tested by many different techniques [25], [26], and built-in self-testing can be efficiently applied to deep sub-micron high-speed wires [27], [28]. A typical IC often contains multiple cores with built-in boundary scan design, as well as significant amounts of noncore logic which does not have any built-in test access mechanism. The noncore logic can be efficiently tested with the aid of the core. For example, in [29], a hierarchical test access architecture has been proposed to integrate the cores and the noncore logic for testing. We will extend the proposed approach to cover the noncore logic by test pattern broadcasting, and this will be part of our future research.
II. INTRODUCTION
To shorten the product development cycle for an integrated circuit or system, pre-designed cores or intellectual properties (IP) are increasingly being used. As shown in Fig. 1(a), a SoC design generally consists of many cores (such as microprocessors, memories, and DSP circuits), and user-defined logic circuits (UDL). The structural information of each hard core is not provided because it is proprietary. As a result, for testing these cores, core users must rely on core providers to offer testing information, so that all cores embedded into a large circuit can be well tested after the IC manufacturing process [3], [4]. This work mainly concentrates on testing hard cores using the concept of test pattern broadcasting [1], [2].
From the aspect of core-based testing, core providers are responsible for delivering (1) the design-for-testability (DFT) hardware inside each core, (2) the test patterns of each core, and (3) the validation of those test patterns. Several problems arise when testing the pre-designed cores, e.g., how to apply test patterns to each core and how to test the user defined logic around the core. Test access mechanisms (TAM) and test wrappers have been proposed as important components of a core-based test architecture. TAM deliver test sequences to cores on the SoC and bridge the physical distance between test source and each core, as well as between each core and test sink [3] , [4] . Test wrappers translate the test sequences into patterns which can be applied directly to the cores. Each test wrapper is in fact a thin shell (the shaded area for each core in Fig. 1(a) ) around the associated core that connects the TAM(s) to the core. Wrappers may also provide width adaptation in case of a mismatch between core width and TAM width. One simple solution for TAM design is to place a boundary scan chain around each embedded core so as to provide access for the core testing [5] . For example, if each core in Fig. 1(a) has a boundary scan chain as shown in Fig. 1(b) , then each core can be tested by shifting in a set of golden patterns from the scan chain. Here, a golden test pattern of a core represents a test pattern delivered by the core provider, although it is traditionally used to represent a test pattern generated by a golden device (known-good device). The drawback of the boundary scan scheme is the prohibitively long test application time, which has become one of the major costs for testing. Other solutions include multiplexed direct parallel access [6] , TestRail [7] , partial isolation rings [8] , and addressable test ports [9] . 
With respect to core wrapper design, the test collar architecture has been proposed using variable-width buses for test data and control [10]. In [7], the Test-Shell wrapper was proposed; this wrapper is scalable and supports the operating modes required by the IEEE P1500 standard [11]. Test wrapper and TAM design is of critical importance in core-based integration because it directly impacts the automatic test equipment's (ATE) memory organization and the test application time. Recently, TAM design and wrapper optimization have been considered jointly to minimize the test time or test power dissipation [12].
Test application time reduction for scan design has been thoroughly researched [13]. Recently, the concept of test pattern sharing (for multiple circuits or modules) has been proposed by a novel scan structure which significantly reduces the test application time [1]. In the technique proposed in [1], it is assumed that the modules to be tested are independent. Traditionally, scan flip-flops of different modules are chained together as in Fig. 2(a), and the overall scan depth equals the sum of the scan depths of all modules. Note that the scan depth of a module is the total number of scan flip-flops in the module. The new scan test structure allows test vectors to be broadcast to all modules as shown in Fig. 2(b), and therefore the overall scan depth can be dramatically reduced to the maximum scan depth among all modules. Based on this test architecture, the concept of a virtual circuit is developed for the automatic test pattern generation (ATPG) process. It has been reported that, generally, the test length (i.e., the number of test patterns) for the new scan structure (Fig. 2(b)) is only slightly increased when compared to that of Fig. 2(a) [1]. As the scan depth is dramatically reduced with a slight increase in the test length, the total test time for scan testing can be significantly decreased. According to the experiment conducted in [1], a significant reduction in total test time can be achieved in most cases, when compared to the conventional single scan chain technique. Note that the output responses in Fig. 2(b) can be compressed into a multiple input signature register (MISR) for analysis, though this is not shown in the figure. In [2], the technique of parallel serial full scan (PSFS) has been proposed based on the concept of test pattern sharing for testing a single full scan embedded core by breaking the single scan chain into multiple (shorter) scan chains.
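As a rough illustration of the scan depth argument, the following Python sketch compares the chained structure (depths add up) with the broadcast structure (the deepest module dominates). The module depths, test lengths, and the simplified cycle model (one full scan load per pattern, capture cycles ignored) are assumptions made only for this example and are not taken from [1].

```python
def serial_scan_depth(module_depths):
    """Single chain through all modules: the scan depths add up."""
    return sum(module_depths)


def broadcast_scan_depth(module_depths):
    """Broadcast structure: all modules shift in parallel,
    so the deepest module determines the overall scan depth."""
    return max(module_depths)


def shift_cycles(scan_depth, test_length):
    """Simplified cost model (assumption): one full scan load per pattern."""
    return scan_depth * test_length


# Hypothetical scan depths of three independent modules.
depths = [120, 80, 200]

# Hypothetical test lengths: broadcasting needs slightly more patterns.
serial = shift_cycles(serial_scan_depth(depths), test_length=100)
broadcast = shift_cycles(broadcast_scan_depth(depths), test_length=110)

print(serial, broadcast)
```

Under these assumed numbers the broadcast structure needs fewer shift cycles even though its test length is 10% longer, which is the trade-off described above.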
The PSFS test structure is later known as the Illinois Scan Architecture (ILS). As PSFS deals with test pattern sharing for the same circuit core, the number of scan chains, the distribution of the memory elements to the scan chains, and the order of the memory elements in each scan chain are all design parameters. The total test application time and test data volume can be reduced by 85% and 67% respectively [2] .
The key to the success of [1], [2] is that, while one module is tested by its test patterns, the same test patterns can be applied simultaneously to other modules in the manner of random testing. However, such a powerful technique cannot be directly applied to core-based testing, unless the core provider gives more information about the circuit structure of each core. In Fig. 2(b), suppose CUT(1) is a UDL while CUT(2) is a reusable core whose test patterns and expected output responses have been provided. Now, if the test patterns for CUT(1) are broadcast to CUT(2), the fault coverage of CUT(2) cannot be obtained, because the circuit structure of the latter is unknown. Of course, we can broadcast the test patterns of CUT(2) (i.e., the golden test patterns delivered by its core provider) to CUT(1), whose structure is known. However, the problem still cannot be solved if both CUT(1) and CUT(2) are reusable cores. In summary, the concept of test pattern broadcasting cannot be directly applied to core-based testing, unless core users can measure the fault coverage of each core when a set of nongolden patterns is applied. Further, core users must have a method to select a subset of golden test patterns to achieve the required fault coverage for each core. By applying the concept of test pattern broadcasting, the single scan chain test architecture shown in Fig. 1(a) can be modified to that of Fig. 3. Again, all test responses are assumed to be compressed into a MISR.
Giving only the golden test patterns of a core to its users can unnecessarily limit their flexibility in testing the core. However, for proprietary reasons, the circuit netlist of the core must be hidden from core users. A possible solution is to allow core users to access a secure computer provided by the core vendor for fault simulation [17]. In this paper, we propose a novel way of providing information to core users so that they can apply any set of test patterns to each core, and can also obtain the fault coverage of the core without knowing its structural netlist. The basic idea is to provide core users with a fault simulator, which is able to evaluate the fault coverage of the core (after the core has been tested by nongolden test patterns). To hide the circuit structure of the core from core users, the circuit netlist of the core is scrambled so core users cannot read it. However, the fault simulator delivered by the core provider can decode the netlist and perform the fault simulation process. Further, a table containing a set of golden test patterns, and the corresponding faults detected by each test pattern, is also delivered. With this table, core users can select a subset of golden test patterns to achieve a specified fault coverage for the core, if the nongolden test patterns cannot accomplish satisfactory fault coverage. An enhanced logic simulator must also be provided by the core provider to generate the fault-free output response for each test pattern. The fault-free output responses are used to simulate the digital signature for the MISR designed by a core user. Again, the enhanced logic simulator can read the scrambled netlist for logic simulation. The benefits of the new core-based test method are twofold: (1) the test application time for testing the core can be greatly reduced, and (2) core users have freedom in accomplishing the required fault coverage for each individual core.
The organization of this paper is as follows. Section III reviews the techniques of test pattern broadcasting and core intellectual property protection for fault simulation. Section IV describes the proposed core-based test generation approach, the encryption and decryption algorithms for netlist scrambling, and the overall structure of the test scheme. In Section V, we give a complete example to describe the methods discussed in Section IV including the application of the approach to P1500. Section VI shows the simulation results for test pattern broadcasting as well as netlist encryption and decryption.
III. BACKGROUND
The key idea of test pattern broadcasting in [1] is to consider all circuits driven by scan chains as a single circuit when executing the ATPG process. For example, given a set of circuits, the inputs of these circuits can be connected by a heuristic method. The connected circuits are then treated as a single one (called a virtual circuit) for the ATPG process. The ATPG process applied to the virtual circuit is extremely powerful in test length reduction, because the virtual circuit potentially provides the maximum degree of test compaction. Traditionally, the ATPG process is individually applied to each circuit to generate test sets for these circuits, and then a test compaction process is applied to compress these test sets. Interestingly, the test set obtained by using the virtual circuit has a much smaller test length than the test set obtained by test compaction. The success of [1] lies in identifying this fact and organizing all circuits under test as a single one for the ATPG process. From the research results of [1], it can also be concluded that there is room for test compaction to further reduce the test length. The work of [1] matches the multiple-core testing environment exactly, as the cores in a SoC circuit are independent.
In contrast, the work of [2] deals with testing a single full scan embedded core and matches the single-core testing environment. A heuristic has also been proposed for computing an optimal scan chain configuration that produces a minimal test application time. To test faults which cannot be detected by this test architecture and faults masked by the MISR, the single scan chain architecture (for applying serial patterns) is preserved using extra multiplexers. Recently, a reconfigurable technique has been proposed to alleviate the problem of applying serial patterns [14]. The basic idea is to organize two versions of the Illinois scan architecture, each with its own parameter that determines the scan depth of each of the multiple scan chains. As the two parameters are relatively prime, the signal constraints introduced by the basic ILS structure can be alleviated. This technique significantly reduces both the test application time and the test data volume of the basic ILS design. Other papers which use the idea of breaking a single scan chain into multiple scan chains for test time and test data reduction can also be found in [15], [16].
Fault simulation for a core needs the netlist of the core; however, the netlist of the core is not allowed to be revealed. In [17] , instead of providing a test set or netlist of a core, the core's vendor allows core-users to make query/request to their CAD procedures. Each query/request is processed on a secure computer approved by the core provider. Thus, the core's proprietary information such as its layout and netlist are encapsulated in core-level procedures which are executed on a secure computer. Fault simulation can be performed by querying each core for nonproprietary information such as (a) core output response for an input vector, and (b) erroneous output response for a target fault under a specific input vector.
IV. CORE TESTING BY TEST PATTERN BROADCASTING
To employ the test pattern broadcasting technique, it is necessary to ascertain the fault coverage of testing a core using the broadcast patterns. This is traditionally done by fault simulation. Unfortunately, the structure of the core is often hidden from the users for proprietary reasons, which makes conventional fault simulation inapplicable. To solve this problem, we propose a novel method to facilitate the testing of cores using the broadcasting architecture, while still preserving the confidentiality of each core's content. The basic idea is as follows. First, the gate-level netlist of a core is encrypted as an ordinary file. The encrypted netlist is then sent to core users. Because a typical fault simulator does not work with the encrypted netlist, the core provider must supply an enhanced version of the fault simulator which takes the encrypted netlist directly for simulation. In this way, we not only make the core available in terms of testability, but also preserve the confidentiality of the core because its netlist is never disclosed to users.
The greatest benefit of the proposed core testing approach is to give the maximum flexibility to different core users. A successful core will be used by many applications which might have extreme test requirements. If the fault coverage of test patterns delivered by the core provider is too high for a specific application, then the test cost is unnecessarily high. On the contrary, if the fault coverage of test patterns supplied by the core provider is too low for an application, then the test quality cannot be guaranteed. Our objective is mainly to improve the core test flexibility for core users, and core providers must be able to support: 1) a netlist hiding technique for core protection, 2) an enhanced fault simulator for fault coverage analysis, 3) a pattern suggestion table for fault coverage enhancement, and 4) an enhanced logic simulator for expected output derivation.
A. The Hiding Technique for Netlist Structure
As a commercial fault simulation tool usually requires the gate-level netlist as its input, we must hide the gate-level netlist that is supplied to the fault simulator. The gate-level structure is usually written to a file in an HDL format. The netlist file provided to a core user must not be "readable" by the core user. There has been much research on scrambling and unscrambling a text file, such as DES (Data Encryption Standard) [18]-[20], and FEAL (Fast Data Encipherment Algorithm) [21]. These methods allow the scrambled file to preserve the confidentiality of each core, as its netlist is never disclosed to core users. Note that the fault simulator (executable code) delivered by the core provider can decode the scrambled netlist, from which the circuit netlist can be recreated by the fault simulator. In this research, we adopt the RSA encryption and decryption algorithms described in [22].
The basic idea of hiding a netlist is to first break the long netlist into a series of blocks, and to represent each block as an integer between 1 and $n-1$ using any standard representation. Thus, each block contains a piece of circuit netlist called $M$. The purpose here is to get the netlist into a numeric form necessary for encryption. To encrypt $M$, we raise it to the $e$th power modulo $n$, i.e., the result (the cipher-text $C$) is the remainder when $M^e$ is divided by $n$. After each block $M$ has been encoded into $C$, the netlist is scrambled into another file and can be shipped to core users. In the enhanced fault simulator, there must be a routine to decrypt the scrambled netlist. The basic idea is to raise each cipher-text $C$ to another power $d$, again modulo $n$. In summary, the encryption and decryption algorithms $E$ and $D$ are:

$$C \equiv E(M) \equiv M^{e} \pmod{n}$$

for each netlist block $M$, and

$$D(C) \equiv C^{d} \pmod{n}$$

for each scrambled block $C$. The remaining problem is how to determine the values for $n$, $e$, and $d$. The first step is to compute $n$ as the product of two very large primes $p$ and $q$:

$$n = p \cdot q.$$

We then pick the integer $d$ to be a large, random number which is relatively prime to $(p-1)(q-1)$. The integer $e$ is finally computed from $p$, $q$, and $d$ to be the multiplicative inverse of $d$, modulo $(p-1)(q-1)$. Thus, we have

$$e \cdot d \equiv 1 \pmod{(p-1)(q-1)}.$$

It has been proven in [22] that this guarantees that the decryption and encryption relation $D(E(M)) = M$ holds. We emphasize again that the function $E$ is performed by a core provider to scramble the netlist file, while $D$ is embedded in the enhanced fault simulator to recover the scrambled netlist back to its original form for further processing. Thus, the enhanced fault simulator contains function $D$, and keys $d$ and $n$. As the enhanced fault simulator is delivered to core users in the form of object code, the decryption function and keys will not be released to core users. Similarly, there is no way for core users to decode the scrambled netlist because the function and keys are well protected within the enhanced fault simulator. Simulation results demonstrate that all benchmark circuits used in our experiment (Section VI) can be encoded and decoded in short CPU times, if the key length is not larger than 1024 bits.
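The block-wise scrambling and unscrambling of a netlist can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the two primes are small, publicly known Mersenne primes chosen for reproducibility, the starting point for the search of the decryption exponent is arbitrary, and the netlist text is a hypothetical example. Real keys would use much larger, secret, randomly generated primes.

```python
from math import gcd

# Toy parameters (assumption): two known Mersenne primes, NOT secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                      # modulus n = p * q
PHI = (P - 1) * (Q - 1)        # (p - 1)(q - 1)


def pick_d(start):
    """Return the first integer >= start that is relatively prime to PHI."""
    d = start
    while gcd(d, PHI) != 1:
        d += 1
    return d


D = pick_d(2**100 + 1)         # decryption exponent d (arbitrary start value)
E = pow(D, -1, PHI)            # e = inverse of d modulo PHI (Python 3.8+)

# Each block must encode an integer smaller than N.
BLOCK_BYTES = (N.bit_length() - 1) // 8


def encrypt_netlist(text):
    """Split the ASCII netlist into blocks and compute C = M^e mod n per block."""
    data = text.encode("ascii")
    blocks = [data[i:i + BLOCK_BYTES] for i in range(0, len(data), BLOCK_BYTES)]
    return [pow(int.from_bytes(b, "big"), E, N) for b in blocks]


def decrypt_netlist(cipher_blocks):
    """Compute M = C^d mod n per block and reassemble the netlist text.
    Assumes no block starts with a NUL byte, which holds for netlist text."""
    out = bytearray()
    for c in cipher_blocks:
        m = pow(c, D, N)
        out += m.to_bytes((m.bit_length() + 7) // 8, "big")
    return out.decode("ascii")


# Hypothetical netlist used only to exercise the round trip.
netlist = (
    "module demo(a, b, y);\n"
    "  input a, b;\n"
    "  output y;\n"
    "  and g1(y, a, b);\n"
    "endmodule\n"
)
scrambled = encrypt_netlist(netlist)
recovered = decrypt_netlist(scrambled)
print(recovered == netlist)
```

In the proposed scheme only the routine corresponding to `decrypt_netlist`, together with $d$ and $n$, would be embedded in the enhanced fault simulator; the encryption side stays with the core provider.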
B. The Overall Structure of the Test Scheme
Consider a set of cores and UDL used to implement a random logic with scan chains included. When these cores and UDL are tested, the best solution is to have them tested simultaneously, if the power dissipation is tolerable. This can be achieved by parallel scan chains which enable test patterns to be shifted in parallel. To control the test pattern shift time, the scan depth of each scan chain must be reasonably small. For example, if a core design is comprised of 1000 flip-flops and 5 scan chains, then the scan structure will result in 200 bits (clock cycles) per scan vector load (if scan depths are balanced). The scan interface will consist of 11 signals (5 scan inputs, 5 scan outputs, and 1 scan enable). So, there is a trade-off between test application time and test wrapper complexity. Basically, test pattern broadcasting can be implemented at two levels, i.e., test broadcasting for circuit blocks within a core (or UDL) as shown in Fig. 4(a), and test broadcasting between cores and UDL as shown in Fig. 4(b). Here, we assume that each core and each UDL individually do not have the test pattern broadcasting technique implemented. Thus, we just concentrate on test broadcasting at the level of cores and UDL (the approach can be easily extended to the general case). In this case, the application of test broadcasting will reduce the number of test pins for the entire chip, while the benefit of testing the cores and UDL in parallel still remains.
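The arithmetic behind the 1000 flip-flop example can be written out as a short Python sketch. The function names are hypothetical, and the model assumes balanced scan chains with a single shared scan enable, as in the example above.

```python
def scan_load_cycles(num_flip_flops, num_chains):
    """Clock cycles per scan vector load with balanced chains
    (ceiling of flip-flops divided by chains)."""
    return -(-num_flip_flops // num_chains)


def scan_interface_signals(num_chains):
    """One scan input and one scan output per chain, plus one scan enable."""
    return 2 * num_chains + 1


print(scan_load_cycles(1000, 5))      # 200 cycles per vector load
print(scan_interface_signals(5))      # 11 interface signals
```

Doubling the number of chains to 10 would halve the load time to 100 cycles but raise the interface to 21 signals, which is the trade-off between test application time and wrapper complexity noted above.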
As shown in Fig. 5 , besides the hard core and its scrambled netlist, the core provider delivers a pattern suggestion table (PST), an enhanced fault simulator, an enhanced logic simulator, and a fault coverage booster for each core. The purpose of the pattern suggestion table is to provide a set of golden test patterns (prepared by the core provider) with specific fault coverage. However, the purpose of the enhanced fault simulator is to evaluate fault coverage of the core for random test patterns which are in fact the test patterns of UDL or other cores. If the enhanced fault simulator cannot give fault-free output responses (of the core) for these random test patterns as a side product, an enhanced logic simulator (which also takes the scrambled netlist as input) must be delivered as well. The output responses are required for test evaluation by digital signature analysis, for example. Given these test facilities, the cores can be tested when the UDL and other circuits are tested. If the fault coverage of a core is not high enough after nongolden test patterns are applied, to further increase the fault coverage, a set of deterministic test patterns can be selected by the fault coverage booster from the core's pattern suggestion table. If the pattern suggestion table is too large for delivering, it is possible to use an enhanced ATPG tool as a replacement. Similar to netlist encryption, it is also possible to encrypt the PST as a scrambled file, if the PST content must be kept confidential.
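To illustrate how fault-free output responses from the enhanced logic simulator could be turned into a digital signature, the following Python sketch models a small MISR in software. The register width, the feedback polynomial ($x^4 + x + 1$), the seed, and the response words are all assumptions chosen for illustration; the MISR actually used is designed by the core user.

```python
def misr_signature(responses, width=4, poly=0b0011, seed=0):
    """Compact a sequence of `width`-bit response words into one signature.

    `poly` holds the low-order feedback taps of an assumed polynomial
    x^4 + x + 1; each step shifts the register, applies the feedback,
    and XORs in the next response word.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for word in responses:
        msb = (state >> (width - 1)) & 1
        state = (state << 1) & mask
        if msb:
            state ^= poly
        state ^= word & mask
    return state


# Hypothetical fault-free responses and a copy with one flipped bit.
golden = [0b1010, 0b0111, 0b0001, 0b1100]
faulty = [0b1010, 0b0110, 0b0001, 0b1100]

expected = misr_signature(golden)
observed = misr_signature(faulty)
print(expected != observed)
```

The core user would compare the signature read from the chip against `expected`; a mismatch indicates a faulty response, subject to the usual aliasing probability of signature analysis.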
V. A COMPLETE EXAMPLE
The scenario of the proposed core test method can be described by the example shown in Fig. 6. First, the gate-level representation of a specific core is shown in Fig. 6(a), and its test vectors are shown in Fig. 6(b). The test vectors can detect 11 faults as shown in Fig. 6(c) after fault collapsing. The pattern suggestion table is provided in Fig. 6(d). Further, the circuit netlist can be described by the Verilog code shown in Fig. 7.
Given the Verilog netlist, the encryption algorithm reads in the characters one by one and packs them into blocks. Note that space, semicolon, and carriage return are all treated as characters. In fact, all ASCII characters can be processed and each character is represented by its ASCII code, which contains eight bits. The block size depends on the value of the modulus $n$, which is 128 bits long in this example. Thus, the block size is determined as 16 (128/8) characters. Each block is then encrypted using the RSA method [22] with this value of $n$ and the assigned encryption key $e$. Finally, the above Verilog code can be encrypted to the scrambled netlist as shown in Fig. 8. By applying the same value of $n$ and the assigned decryption key $d$, the scrambled netlist can be decrypted back to the original Verilog netlist.
Assume two test patterns are broadcast to the core in Fig. 6(a) when the UDL are tested. These two test patterns are random test patterns for the core, and contain test vectors (001 100) and (010 101) as shown in Fig. 9(a) and (b). The fault coverage of the core is 55% after the application of both vectors to the core, i.e., six of the 11 faults are detected and five faults remain undetected (Fig. 9(c)). If the core user is satisfied with the fault coverage, then it is not necessary to test the core any longer. Otherwise, another set of test patterns must be selected from the pattern suggestion table. As shown in Fig. 5, core users only know the fault coverage; however, the system can also report all faults remaining undetected by a slight modification to the proposed core-based testing system. Given a set of undetected faults, we must select a minimum number of test vectors from the pattern suggestion table to further increase the fault coverage to a specific level. The problem is in fact the well-known table-covering problem, which is NP-hard, so a good heuristic must be used. Here, we use a greedy method which always chooses the test vector that maximizes the number of newly detected faults. As shown in Fig. 9(c), the first test vector selected from the pattern suggestion table is the one able to detect three of the remaining faults. Two more test vectors are then selected for complete fault coverage. Note that core users can terminate the pattern selection process as soon as the fault coverage is high enough to guarantee test quality.

The IEEE P1500 Standard for embedded core test [11] consists of two components: a core test language to facilitate the test knowledge transfer from core provider to core users, and a core test wrapper. The proposed test broadcasting method can be easily incorporated into a core design based on P1500. As shown in Fig. 10, the multi-bit TAM plug (MTP) of each core is connected to the TAM channel, which is in fact a test data highway; the MTP width for core logic 1 is 4 while that for core logic 2 is 7. Test data can be shared regardless of the number of scan chains in each core. Based on the concept of test broadcasting, the number of signal lines required for the TAM channel can be as small as $\max_i w_i$, where $w_i$ is the number of signal lines for the MTP of core or UDL $i$. For the example in Fig. 10, the number of signal lines for the TAM channel can be 7 (the maximum of 4 and 7), instead of 11 (their sum). Note that the test response of each core or UDL can be evaluated by digital signature analysis as shown in Fig. 10.
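The greedy selection of golden patterns from the pattern suggestion table, as described earlier in this section, can be sketched as follows. The table contents, fault identifiers, and function name are hypothetical and do not reproduce Fig. 6(d) or Fig. 9(c); they are chosen only so that the run mirrors the narrative (six of 11 faults already detected, a first pick that covers three more faults, then two further picks).

```python
def boost_coverage(pst, detected, total_faults, target):
    """Greedily pick golden patterns from the PST until coverage >= target.

    pst      : dict mapping pattern name -> set of faults it detects
    detected : faults already detected by the broadcast (random) patterns
    Returns the chosen pattern names and the final fault coverage.
    """
    detected = set(detected)
    remaining = dict(pst)
    chosen = []
    while len(detected) / total_faults < target and remaining:
        # Choose the pattern that detects the most still-undetected faults.
        best = max(remaining, key=lambda p: len(remaining[p] - detected))
        gain = remaining.pop(best) - detected
        if not gain:
            break                  # no remaining pattern detects a new fault
        chosen.append(best)
        detected |= gain
    return chosen, len(detected) / total_faults


# Hypothetical pattern suggestion table over faults numbered 1..11.
pst = {
    "t1": {1, 2, 7, 8, 9},
    "t2": {3, 10},
    "t3": {4, 5, 11},
    "t4": {6, 7},
}
after_broadcast = {1, 2, 3, 4, 5, 6}   # 6 of 11 faults, about 55% coverage

full, cov_full = boost_coverage(pst, after_broadcast, 11, target=1.0)
partial, cov_partial = boost_coverage(pst, after_broadcast, 11, target=0.8)
print(full, cov_full)
print(partial, round(cov_partial, 3))
```

Because the loop stops as soon as the target is reached, a core user with a relaxed coverage requirement (80% in the second call) selects fewer golden patterns than one requiring complete coverage.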
VI. EXPERIMENTAL RESULTS
To evaluate the performance of our proposed core-based test approach, we conducted a computer simulation experiment. The implementation contains two major parts: (1) test pattern broadcasting and golden pattern selection, and (2) netlist encryption and decryption using the RSA method [22]. First, a set of test patterns is generated for a specified core using the ATPG tool provided by SIS (sequential interactive synthesis) [23]. The faults detected by each test pattern are then derived using fault simulation. Thus, the pattern suggestion table for the core can be easily generated. The above process is repeated for a set of selected MCNC and ISCAS85 benchmark circuits to generate the pattern suggestion table of each core. Further, one benchmark circuit is assumed to be a UDL circuit, while all other benchmark circuits are assumed to be core circuits. Therefore, the test patterns of the UDL circuit are broadcast to all core circuits as random test patterns. For each individual core, if the fault coverage under the set of random test patterns is not high enough, then extra test patterns are selected from its pattern suggestion table to further increase the fault coverage.
The simulation results are shown in Table I, where the first column gives the name of each circuit, while the second column shows the size of each circuit by its literal count. The number of primary inputs for each circuit is listed in the third column, and the fourth column gives the number of test patterns required to achieve 100% fault coverage without test pattern broadcasting. The CPU time in seconds (on a SUN Ultra 10) consumed for the test generation of each circuit is shown in column five. The results for test pattern broadcasting are listed from column 6 to column 9. The simulation is performed based on the assumption that the test patterns of the UDL circuit are broadcast to each core as random patterns. Thus, column 6 gives the number of test patterns further selected from the pattern suggestion table for each core circuit to achieve 100% fault coverage. Columns 7, 8, and 9 give the numbers of test patterns selected from the pattern suggestion table for each core to achieve 98%, 95%, and 90% fault coverage, respectively. The CPU time (on a SUN Ultra 10) required for each execution of fault coverage enhancement is very small (0.01-0.02 seconds), so it is omitted. For example, circuit DALU (Data Arithmetic and Logic Unit) has literal count 3588 and the number of test vectors required is 233 using traditional scan methods. However, by broadcasting the test patterns of the UDL circuit, only 129 more test patterns are required from the pattern suggestion table of DALU for 100% fault coverage. Only 52 (21) more test patterns are required if 98% (95%) fault coverage is to be achieved.
From Table I, it can be observed that the performance of the proposed approach depends strongly on whether the core receiving the broadcast test patterns contains many random-pattern-resistant faults. For some circuits, such as , all faults are detected once the broadcast test patterns have been applied. However, there are circuits in which only a very limited number of faults can be detected by the broadcast test patterns. For example, circuit can be tested by 218 deterministic test patterns with full fault coverage, as shown in Table I. After random testing by broadcasting the test patterns of , only 12 faults can be detected, and another set of 206 test patterns must be selected from the pattern suggestion table of for full fault coverage. As shown in Table I, with test pattern broadcasting the majority of circuits require less than 50% of their original test lengths to reach 100% fault coverage.
We also applied the golden test patterns (selected from the pattern suggestion table) of circuit pair as random test patterns to the other circuit cores. We observed that the numbers of test patterns required (selected from the pattern suggestion tables) to achieve 100% fault coverage are reduced for some circuits, such as . For example, circuit requires 157 deterministic test patterns for 100% fault coverage. After the golden test patterns of are applied, 84 more test patterns must be selected from the pattern suggestion table of for full fault coverage. By broadcasting the test patterns of pair, only 74 more test patterns need to be selected from the PST of to achieve the same goal. However, some circuits are not sensitive to the golden test patterns of pair, and thus more patterns must be selected from their PST.
To evaluate the performance of netlist encryption and decryption, we also implemented the RSA method in C++. Table II shows the encryption and decryption times for a few benchmark circuits in BLIF (Berkeley Logic Interchange Format) representation [23]. The key length is set to 128 bits, and the values of , , and are the same as those used in the complete example discussed above. The table shows that the encryption and decryption times are nearly symmetrical for each circuit, and the CPU times (on a SUN Blade 1000) are small for most benchmark circuits. The encryption and decryption times of each circuit are roughly proportional to the circuit size. We also noticed that circuit too_large has smaller encryption and decryption times than and , although the literal count of too_large is far larger. The reason is that too_large has a more compact BLIF representation than the other two circuits.
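To make the encryption step concrete, the following is a minimal textbook-RSA sketch applied to a small BLIF netlist. It is illustrative only and is not the C++ implementation evaluated in Table II: the tiny primes (61 and 53), the public exponent 17, the byte-wise blocking, and all function names are assumptions chosen so that the arithmetic is easy to follow. Unpadded RSA with such a short key offers no real security.

```python
from typing import List

# Toy key material (assumed for illustration; far too small for real use).
P, Q = 61, 53
N = P * Q                    # modulus, 3233
PHI = (P - 1) * (Q - 1)      # Euler totient, 3120
E = 17                       # public exponent, coprime to PHI
D = pow(E, -1, PHI)          # private exponent (Python 3.8+), 2753


def rsa_encrypt_bytes(data: bytes) -> List[int]:
    """Encrypt each byte as its own block: c = m^E mod N."""
    return [pow(m, E, N) for m in data]


def rsa_decrypt_blocks(blocks: List[int]) -> bytes:
    """Decrypt each block: m = c^D mod N."""
    return bytes(pow(c, D, N) for c in blocks)


# A two-input AND gate in BLIF, standing in for a core netlist.
netlist = (
    ".model and2\n"
    ".inputs a b\n"
    ".outputs y\n"
    ".names a b y\n"
    "11 1\n"
    ".end\n"
)

cipher = rsa_encrypt_bytes(netlist.encode("ascii"))
plain = rsa_decrypt_blocks(cipher).decode("ascii")
print(plain == netlist)
```

Because every block costs one modular exponentiation, the running time of this sketch grows with the length of the BLIF text rather than with the literal count, which is consistent with the observation about too_large above.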
The data security of a coded netlist depends strongly on the key length of , which determines the values of and . The larger the key length of , the more secure the encrypted netlist will be. Table III shows the CPU times (on a SUN Blade 1000) required with different key lengths of for each benchmark circuit selected in this experiment. We observed that the CPU time for each execution of encryption or decryption increases dramatically as the key length of is enlarged. For example, can be encrypted in 38.98 seconds when the length of is 128 bits, but the time rises to 1674.31 seconds when the key length of is increased to 1024 bits. To reduce the CPU time required, we compress (decompress) each netlist using the adaptive Huffman code compression method [24] before (after) it is encrypted (decrypted). In most cases, netlist compression reduces the CPU time of each execution of encryption and decryption by about 40% to 50%. It should be noted that, compared with the CPU times for encryption and decryption, the CPU times required by adaptive Huffman compression and decompression are small (about 0.06-0.09 seconds) and can be ignored. The advantages of using a mixture of different coding methods (e.g., RSA plus Huffman in this work) are: (1) the CPU time required for each execution of netlist encryption or decryption is greatly reduced; and (2) the netlist is more secure, because it is hard for an attacker to guess the combination of coding methods.
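The compress-then-encrypt order described above can be sketched as follows. Two substitutions are made here and should be read as assumptions: Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) stands in for the adaptive Huffman coder of [24], and a toy byte-wise textbook RSA with modulus 3233 stands in for the real cipher. The function names `protect` and `recover` are hypothetical.

```python
import zlib
from typing import List

# Toy RSA key (assumed for illustration only; not secure).
N, E, D = 3233, 17, 2753


def protect(netlist: str) -> List[int]:
    """Compress the netlist first, then encrypt the compressed bytes."""
    packed = zlib.compress(netlist.encode("ascii"), 9)
    return [pow(b, E, N) for b in packed]


def recover(blocks: List[int]) -> str:
    """Decrypt the blocks first, then decompress to the original text."""
    packed = bytes(pow(c, D, N) for c in blocks)
    return zlib.decompress(packed).decode("ascii")


# A repetitive BLIF-like body; real netlists repeat keywords in a similar way.
netlist = ".model demo\n" + ".names a b y\n11 1\n" * 50 + ".end\n"

blocks = protect(netlist)
print(len(netlist), "plaintext bytes ->", len(blocks), "RSA blocks")
print(recover(blocks) == netlist)
```

Since the cost of this sketch is one modular exponentiation per block, shrinking the input before encryption reduces the work roughly in proportion to the compression ratio, which is the effect reported for the compressed netlists.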
