Dynamic-current based test techniques can potentially address the drawbacks of traditional and I ddq test methodologies. The quality of dynamic current based test is degraded by process variations in IC manufacture. The energy consumption ratio (ECR) is a new metric that improves the effectiveness of dynamic current test by reducing the impact of process variations by an order of magnitude. We address several issues of significant practical importance to an ECR-based test methodology. We use the ECR to test a low-voltage submicron IC with a microprocessor core. The ECR more than doubles the effectiveness of the dynamic current test already used to test the IC. The fault coverage of the ECR is greater than that offered by any other test, including I ddq . We develop a logic-level fault simulation tool for the ECR and techniques to set the threshold for an ECR-based test process. Our results demonstrate that the ECR offers the potential to be a high-quality low-cost test methodology. To the best of our knowledge, this is the first dynamic-current based test technique to be validated with manufactured ICs.
Introduction
Traditional logic-fault model based methodologies test a circuit by observing the output logic values produced in response to specific input vectors. As device density and complexity have increased, logic test techniques alone have not been able to provide adequate defect coverage. To improve real defect coverage, I ddq test methods, based on monitoring the static supply current in CMOS circuits, have been widely used. With deep submicron technology, the background leakage current is expected to rise sharply because of the increase in the number of devices per IC. The higher leakage current will degrade the quality of I ddq test, since the impact of a defect will be harder to detect. It has been recognized that new test methodologies are needed.
Test techniques based on monitoring dynamic supply currents are an interesting alternative. For example, a wide range of faults including redundant stuck-at faults, open defects and shorts can be detected by monitoring the dynamic current consumed by a circuit.
Average dynamic currents are far larger, and hence easier to measure, than static currents. The quality of dynamic current test methods is markedly degraded by normal process variations, which cause the current consumed by a fault-free circuit to vary substantially. The impact of process variations can be substantially negated by using the ratio of energies consumed on two distinct input transitions as a test metric. The new metric, the energy consumption ratio, markedly improves the fault coverage of dynamic-current based test even in the presence of substantial process variations.
This research was supported in part by the NSF through grant No. MIP-9502240 and by Guidant Corporation.
Contributions We verify the real quality of the energy consumption ratio (ECR) by applying it to a low-voltage submicron biomedical IC with a microprocessor core. A significant fact in our experiments is that the test vectors constituting the ECR were chosen independently by the manufacturer, and not with the goal of maximizing its fault coverage. We demonstrate that even for manufactured ICs, the ECR is more tolerant by an order of magnitude to the impact of process variations than its component dynamic currents. We develop techniques to set the test thresholds for an ECR-based test process. We develop an accurate logic-level fault simulation tool, Ecrsim, for the ECR. The tool reduces computational costs while preserving the accuracy offered by circuit simulation. Other dynamic current test techniques reported in the literature require circuit simulation to estimate the fault coverage they provide. We also verified the quality of the ECR test on a production IC. Our results show that ECR-test more than doubles the effectiveness of dynamic-current test. The ECR offers a higher defect coverage than any other single test, including I ddq test. Yet, ECR-based test offers several significant advantages over I ddq test. To the best of our knowledge, this is the first dynamic current test technique validated on a manufactured IC.
Previous Work
Rather than voltages, I dd test methods detect faults that affect functionality by monitoring the current drawn by a circuit. One may observe the supply current either when the inputs are static, or when the circuit is responding to input transitions. I ddq methods measure the current drawn when circuit inputs are quiescent, and verify if the measured current is below a (preset) threshold value. The test threshold has to be set to account for the leakage current contributions of all the devices in the circuit and the impact of process variations on the leakage currents. The ability of I ddq methods to detect a wide range of faults, such as bridging faults, has been investigated in detail [1] - [3] . However, I ddq methods are not effective if open faults are present in the circuit [4] .
It is expected that deep-submicron technology will degrade the effectiveness of I ddq test methodologies. Because of the increased number of devices in a single IC, total sub-threshold leakage currents in deep sub-micron devices [5, 6] are expected to be quite substantial. It will be increasingly difficult to detect the impact of a localized defect on the total static current consumed by the IC. Current signatures, based on a series of I ddq measurements, have been suggested as one method to overcome high leakage currents.
Permission to make digital/hardcopy of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DAC 99, New Orleans, Louisiana (c) 1999 ACM 1-58113-109-7/99/06..$5.00
Recently, I ddt test methods which measure transient or dynamic supply currents to detect faults have been proposed [7] - [14] . Similar to I ddq methods, it is not required to propagate the fault effect to the primary outputs of a circuit. Each I ddt test method exploits different circuit characteristics. The method in [7] pulses the supply rails to detect faults. This technique can be applied to both analog and digital circuits. The methods in [8] - [10] monitor the current consumption in individual gates, or a group of small gates. Other techniques monitor the current drawn by the entire circuit. An orthogonal issue is the type of dynamic current to be monitored. The techniques in [8] - [11] monitor the transients on individual transitions. An alternative is to monitor the average current drawn by the circuit under test [13, 14] . I ddt methods have been shown to have the potential to detect a large number of faults that either cannot be detected or are hard to detect with other test methodologies. For example, test techniques based on monitoring a circuit's energy consumption can detect redundant stuck-at faults and open faults.
We address issues of practical importance related to the application of the new test metric, the energy consumption ratio (ECR), that was developed in [14] .
Energy Consumption Ratio Test
The dynamic energy consumption of a circuit can be monitored to detect the presence of faults. This section is a summary of the material in [13, 14] . It has been included to enable the reader to better understand the remainder of the paper.
Consider a stuck-at fault in a circuit that affects steady-state internal signal values. The fault also alters the way internal signals transition in response to input changes. That is, a fault alters the number and location of signal transitions that occur in response to input changes. Thus, a fault can alter the energy consumed by the circuit. This energy consumption change can be detected by monitoring the average dynamic supply current [13] . We will refer to a vector which detects faults by monitoring the power dissipation as an energy test. The power dissipated by the circuit, in general and on specific transitions, may differ from the expected value because of process variations. These variations will degrade the fault coverage. In integrated circuits, the relative variation of the parameters across the IC is low. This fact can be exploited to negate the impact of process variations as follows. Note that gate capacitances are the primary source of dynamic energy consumption. The average current is proportional to the number of signal transitions and to the transistor gate capacitances. Assume that, because of process deviations, the unit gate capacitance is altered by a factor of k_1 from its nominal value across the IC. The average current on a single transition is altered by this variation. However, the ECR, the ratio of the average currents consumed on two distinct input transitions P_1 and P_2, is immune to a first order because the variations in the numerator and denominator cancel each other. To detect faults with an ECR, P_1 is an energy test and P_2 is a reference benchmark transition. The energy test maximizes the impact of a fault on the average current. The ECR of a faulty circuit is similarly immune to process variations. It can be shown that the ECR reduces the impact of process variations without reducing the impact of the fault. Readers are referred to [14] for more details.
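The first-order cancellation argument can be written out as a short worked derivation. This is only a sketch under stated assumptions: the symbols N_j (number of signal transitions triggered by input transition P_j), C (nominal unit gate capacitance) and \alpha (a constant absorbing supply voltage and measurement period) are notation introduced here for illustration and are not taken from [13, 14].

```latex
\begin{align*}
% Assumed first-order model: average current on input transition P_j
I_{avg}(P_j) &\approx \alpha \, N_j \, C, \qquad j = 1, 2 \\
% A process deviation scales the unit gate capacitance: C -> k_1 C
I'_{avg}(P_j) &\approx \alpha \, N_j \, (k_1 C) = k_1 \, I_{avg}(P_j) \\
% The common factor k_1 cancels in the ratio
ECR' = \frac{I'_{avg}(P_1)}{I'_{avg}(P_2)}
     &= \frac{k_1 \, I_{avg}(P_1)}{k_1 \, I_{avg}(P_2)}
      = \frac{I_{avg}(P_1)}{I_{avg}(P_2)} = ECR
\end{align*}
```

Under the same assumed model, a fault that changes N_1 but leaves the benchmark count N_2 untouched changes the ECR by the same percentage as it changes I_{avg}(P_1), which is consistent with the claim that the fault's impact is preserved.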
ECR Fault Simulation
In ECR-based test, the detectability of a fault is determined by its impact on the energy consumption of a circuit. To estimate the effectiveness of a specific pair of potential test vectors, the average transient currents in the good and faulty circuits need to be computed. Circuit-level simulation of every potential solution is too expensive, and may not even be possible for large circuits. A logic-level ECR fault simulator is needed. We develop such a fault simulator in this section. There are two significant issues we address in addition to the normal goal of estimating power dissipation. We are interested in simulating the impact of faults on the power consumption. We are also interested in representing the impact of process variations, in the good and faulty circuits, at the logic level.
Estimating Circuit Power Supply Current
The first order approximation of estimating the ECR to be equal to the ratio of the number of transitions does not offer sufficient accuracy. Therefore, for fault simulation, we use a library-based approach similar to that used in [15, 16] . We assume CMOS circuits are designed using a predefined cell library. All elementary gates in the library are characterized once, and their models are used during the gate-level logic simulation to provide power information with less computational effort. To model the impact of process variations, library elements can also be characterized using data from multiple process runs. For our experiments, we have characterized the library in 44 different processes provided by MOSIS [17] .
In order to estimate the actual shape of the transient waveform, many library-based estimation techniques store the actual transient waveforms for library components. This increases the amount of information to be stored. For example, transient waveforms for each instance of a gate, with a differing number of inputs, differing transistor sizes and for different process runs will have to be stored separately. Since we are interested only in the average current, and not the transient, model representation can be greatly simplified. Each model's average current is a function of the fanout and fanin number of the gate characterized except the NOT gate which is simply a function of its fanout number. For different process runs, the current functions change correspondingly.
The currents consumed on input transitions can now be estimated through logic-level simulation. On any transition, only gates that are activated consume current. The current consumed by an activated gate is extracted from the library information. For gate simulation, as for circuit simulation, we assume that input transitions are spaced so as to permit the transient power supply current of the circuit to settle during that period. This requirement guarantees that each activated gate contributes all its modeled power consumption. The average current is the sum of the individual currents of all activated gates. That is,

$$I_{avg} = \sum_{i \in A} I_i$$

where $I_i$ is the average current of gate $i$ and $A$ is the set of activated gates. For Ecrsim, the estimate of the percentage power consumption difference, or the estimated percentage current difference (PCD), between the good and faulty circuits is the most significant metric. The PCD is the difference between the average currents of the good and faulty circuits, expressed as a percentage.
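The library-based estimate described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the `LIBRARY` coefficients are invented, the linear dependence on fanin and fanout is only one possible model form, and the PCD is taken relative to the good-circuit current. It is not the Ecrsim implementation.

```python
from dataclasses import dataclass

# Hypothetical cell library: average current (arbitrary units) per activated
# gate as a function of fanin and fanout. Coefficients are made up.
LIBRARY = {
    "NAND": lambda fanin, fanout: 1.0 + 0.3 * fanin + 0.5 * fanout,
    "NOR": lambda fanin, fanout: 1.2 + 0.4 * fanin + 0.5 * fanout,
    # The NOT gate model depends on fanout only, as stated in the text.
    "NOT": lambda fanin, fanout: 0.6 + 0.5 * fanout,
}


@dataclass(frozen=True)
class Gate:
    name: str
    kind: str
    fanin: int
    fanout: int


def average_current(activated_gates):
    """Sum the library currents of all gates activated by a transition."""
    return sum(LIBRARY[g.kind](g.fanin, g.fanout) for g in activated_gates)


def pcd(i_good, i_faulty):
    """Percentage current difference, assumed relative to the good circuit."""
    return 100.0 * abs(i_faulty - i_good) / i_good


# Gates activated by one input transition in a hypothetical good circuit.
good = [
    Gate("g1", "NAND", 2, 1),
    Gate("g2", "NOT", 1, 2),
    Gate("g3", "NOR", 2, 1),
]
# Assume the fault blocks the transition at g3, so g3 is no longer activated.
faulty = good[:2]

i_good = average_current(good)
i_faulty = average_current(faulty)
print(f"good={i_good:.2f} faulty={i_faulty:.2f} PCD={pcd(i_good, i_faulty):.1f}%")
```

To model several process runs, one could keep one such library per run and repeat the same logic-level simulation with each of them.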
Results
The evaluation of the quality of the ECR fault simulator focused on a single circuit s1238.bench and on 9 logically redundant stuck-at faults in the circuit. Redundant faults cannot be detected by monitoring the logic outputs. However, it is possible they impact the supply current and can be detected by monitoring the ECR. We use two terms to quantify the impact of process variations.
Definition 2 The term Process Faulty Circuits refers to the same faulty circuit but produced by different process runs.

Definition 3 For a specific stuck-at fault, the term Process Fault Coverage is defined to be:

$$\text{Process Fault Coverage} = \frac{\text{Number of process runs for which the fault is detected}}{\text{Number of process runs simulated}}$$
The ideal process fault coverage is 100%. The faults were chosen such that they had a limited impact on the energy consumption of the circuit. Thus, potentially, the process fault coverage for these faults may not be 100% with dynamic current test, or even with ECR test.
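As a small illustration of Definition 3, the following Python sketch computes the process fault coverage from a list of per-run detection outcomes. The function name and the boolean-per-run input format are assumptions made here; how detection is decided within a run is outside the sketch.

```python
def process_fault_coverage(detected_per_run):
    """Percentage of simulated process runs in which the fault is detected.

    detected_per_run: iterable of booleans, one per simulated process run
    (hypothetical input format).
    """
    runs = list(detected_per_run)
    if not runs:
        raise ValueError("at least one process run is required")
    return 100.0 * sum(1 for detected in runs if detected) / len(runs)


# Example: a fault detected in 4 of 44 simulated process runs.
outcomes = [True] * 4 + [False] * 40
print(round(process_fault_coverage(outcomes), 1))
```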
Experimental Procedure The experiments were conducted as follows. For each fault, a test pair of vectors was generated using a test pattern generation algorithm [14] . The algorithm was also used to generate a benchmark test pair that was common to all faults. The vector pair generated was simulated with Ecrsim, as was the benchmark pair. The simulation is used to estimate the currents consumed by the good and faulty circuits, and the corresponding ECRs.
The vector pair generated was simulated with Hspice, as was the benchmark pair. The simulation is used to compute the currents consumed by the good and faulty circuits, and the corresponding ECRs.
To gauge the impact of process variations, the currents are estimated and computed using Ecrsim and Hspice, over 44 process runs.
Absolute Accuracy The data in Table 1 evaluates the absolute accuracy of the Ecrsim. The currents (estimated and computed) represent average values over 44 process runs. The first column identifies the location of the redundant fault. The data in columns 2, 3, and 4 represent the results estimated using Ecrsim. Columns 2, 3 and 4 represent the estimated average currents, good circuit ECR and the percentage current difference caused by the fault. The ECR data in Column 3 are the ratios of the currents in Column 2 to the estimated benchmark current (not shown in the table) in the good circuit. The benchmark current is the current consumed by the benchmark transition, which was chosen such that it was not affected by any fault. The data in columns 5, 6, and 7 contain the corresponding results computed using Hspice. The average error in estimating current consumption is 3.7%. The average error in estimating the ECR of the good circuit is 5.9%.
Fault Simulation Accuracy
In almost every case, the impact of the fault is larger than the errors in estimation plus three times the standard deviation of the ECR. Therefore, even in the presence of estimation errors, the detectability of a fault can be estimated accurately. Table 2 compares the estimated (using Ecrsim) and computed (using Hspice) fault coverages for both transient current test, and for ECR test. In every case but one, Ecrsim accurately estimates the process fault coverage.
For every fault but one, there is a substantial overlap between the faulty and good circuit current distributions. (This data is not shown in Table 2 .) Thus, in most cases the process fault coverage is very poor. In contrast, for every fault, there is no overlap between the faulty and good circuit ECR distributions. Thus, the ECR always provides 100% process fault coverage. Recall that the percentage impacts of a fault on the transient current and on the ECR are identical. However, the impact of process variations on the ECR is only a tenth of their impact on the transient currents. For a specific fault, the ECR offers a higher process fault coverage because it reduces the impact of process variations without reducing the impact of a fault. Since a fault has to produce a smaller impact to be detectable, the ECR improves overall fault coverage.
Computational effort for fault simulation can be further reduced by observing that, though the currents consumed by the good and faulty circuits vary markedly, the actual percentage current difference caused by a fault is relatively constant across all process runs. The average standard deviation in the PCD is 0.08%. (Since it too is a ratio, the standard deviation in the PCD will be of the same order of magnitude as the variations in the ECR.) Therefore, the percentage impact of a fault on the ECR can be reasonably estimated by simulating the circuit with one process, instead of all available processes. The information generated with such an approach can reasonably be used to estimate the detectability of a fault.
The exact transients on a single transition, either for the circuit as a whole or for a small group of gates, are harder to estimate accurately at the logic level than the average current. The impact of process variations on the transients is also harder to estimate. Therefore, fault detection techniques based on monitoring single transients, or the current consumption at individual gates or small groups of gates [8] - [12] , are presently not able to offer the accuracy required. In contrast, the Ecrsim fault simulator is able to provide accurate fault detection information using logic-level simulation.
Table 2 : Fault simulation accuracy of Ecrsim (surviving row: G78 !G456 9.1 9.1 100 100)
Application to IC Test
An alternative to targeting specific faults with the ECR is to use it much as I ddq test is used. Many commercial I ddq test methodologies are not based on targeting specific faults, but on characterizing the behavior of a fault-free circuit. The test process simply bins products. Similarly, ECR-test can be used to bin products by defining an acceptable range of current ratios, and declaring all ICs whose ECRs fall outside the range to be faulty. To investigate its effectiveness in such a test methodology, we applied the ECR to an IC produced by a local manufacturer of high reliability ICs for biomedical applications. For each device, 14 tests (listed in Table 5) were applied after manufacture, but before packaging. Dynamic current tests examine the switching behavior of the device. There are two dynamic current tests called drxl60 and burnin. The burnin is a self-test type vector sequence that is executed on start-up. The burnin test is designed to cause switching activity in every component of the system, except the memory. The drxl60 is a specification mandated test sequence that exercises various components. The vector sequences in the dynamic test do not significantly excite the RAMs and ROM. More specifically, readers may note that the dynamic current components were not chosen so as to maximize the benefits of our test technique.
Experimental Procedure
The goals of our experiment were to verify the effectiveness of the ECR, and to compare its effectiveness with that of other tests. The experiments were conducted as follows. For our experiment, all the tests in the test suite were applied to each IC. For each device, the set of tests it fails is enumerated. Conversely, for each test, the set of devices which fail it is enumerated. The ECR was applied to the test process as follows:
1. The ECR was defined to be the ratio of the current on the drxl60 sequence to the current on the burnin sequence. Since these tests were already conducted as a part of the regular process, using the ECR simply involved reinterpreting existing test data. In other words, we did not specifically generate tests to maximize the effectiveness of the ECR.
2. The goodness of a device was defined by the regular test process. A device was declared to be good if it passed all the regular tests. It was declared to be defective if it failed one or more regular tests. The data was collected on 887 devices over 6 wafers produced together. Out of the 887 ICs, 326 were identified as being defective, and 561 as being good.
3. A threshold was set for the ECR. A device passed the ECR test if its ECR fell within a threshold of the expected value; otherwise it failed the ECR test. As seen in Table 4 , the good circuit ECR values are closely clustered. The standard deviation for the drxl60 current is 0.0079 and the standard deviation for the burnin current is 0.0076. However, for the ECR (drxl60/burnin), the standard deviation is 0.00086, almost a tenth of the value for the individual currents.
ECR Test Threshold
To apply the ECR test to the sample data, pass and fail ranges for the ECR values have to be defined. The lower and upper boundaries (0.8009 and 0.8046 respectively) of ratios of the good devices obtained above are good candidates to serve this purpose. Thus, for these thresholds all the good devices that pass all the original tests also pass the ECR test. Any device whose ratio falls outside this range fails the ECR test.
5. The fifth row identifies the test that provides the maximum conditional fault coverage for test T k , as well as the corresponding percentage coverage.
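The pass/fail rule for the ECR threshold can be sketched in a few lines of Python. The bounds 0.8009 and 0.8046 are the values reported in the text; the function names, the example current readings, and the choice to treat the bounds as inclusive are assumptions made for this sketch.

```python
# Lower and upper ECR bounds of the good devices, as reported in the text.
ECR_LOW, ECR_HIGH = 0.8009, 0.8046


def ecr(i_drxl60, i_burnin):
    """ECR as the ratio of the drxl60 current to the burnin current."""
    return i_drxl60 / i_burnin


def passes_ecr(value, low=ECR_LOW, high=ECR_HIGH):
    """A device passes if its ECR lies inside the (assumed inclusive) range."""
    return low <= value <= high


# Hypothetical current readings (same unit for both tests).
print(passes_ecr(ecr(80.2, 100.0)))  # ratio 0.802, inside the range
print(passes_ecr(ecr(81.0, 100.0)))  # ratio 0.810, outside the range
```

Because the check reuses currents that are already measured, applying it amounts to one division and two comparisons per device.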
ECR Test Performance
The fault coverage of the ECR test is greater than that offered by any other single test.
Readers may note that the components of the ECR were developed independently by the corporation. They were not chosen so as to maximize the coverage of the ECR test.
Comparison to Dynamic Current Test One may observe that the ECR significantly improves the effectiveness of its individual dynamic current components. Only 104 devices failed the dynamic current test, which gives it a fault coverage of 31.9%. However, 249 devices failed the ECR test, for a fault coverage of 76.4%. The ECR more than doubles the effectiveness of the transient current tests. Notice that the number of devices which failed both the dynamic test and the ECR test is 103. This indicates that more than 99% of the devices that failed the dynamic current test also failed the ECR test, whereas fewer than 50% of the devices that failed the ECR test failed the dynamic current test.
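The percentages quoted above follow directly from the reported device counts. The short Python check below reproduces the arithmetic; the variable and function names are chosen here for illustration, and the counts are those given in the text.

```python
# Device counts reported in the text.
defective = 326        # devices that failed at least one regular test
failed_dynamic = 104   # devices that failed the dynamic current test
failed_ecr = 249       # devices that failed the ECR test
failed_both = 103      # devices that failed both


def pct(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


print(round(pct(failed_dynamic, defective), 1))    # dynamic test coverage
print(round(pct(failed_ecr, defective), 1))        # ECR coverage
print(round(pct(failed_both, failed_dynamic), 1))  # dynamic failures also caught by ECR
print(round(pct(failed_both, failed_ecr), 1))      # ECR failures also caught by dynamic test
print(defective - failed_ecr)                      # defective devices missed by the ECR
```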
Comparison to Other Tests
The ECR test also offers excellent performance relative to other tests. For virtually every test, the ECR test offers the highest conditional fault coverage. For any test T k , on the average the ECR detects 94.8% of the faults detected by test T k . The lowest conditional fault coverages are associated with the memory tests. The ECR detects only 78.4% of the devices that fail the ram4k test. In fact more than 80% of the 77 defective devices that escaped detection with the ECR test failed one of the memory tests. One possible explanation for the lower coverage is that the vector sequences in the ECR do not exercise the memory significantly. The large relative fault coverage offered by the ECR indicates that it can potentially serve as a good screening test. If the ECR test is applied first, the number of faulty devices to be detected by each of the remaining tests is greatly reduced. One advantage offered by the ECR is that average currents can be measured with simple test fixtures. This reduces overall test costs.
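The conditional fault coverage used in this comparison can be read as the fraction of devices failing a test T k that also fail the ECR test. The Python sketch below assumes that reading; the function name and the device identifiers are hypothetical.

```python
def conditional_coverage(failed_ecr, failed_test):
    """Percentage of devices failing a given test that also fail the ECR test.

    Both arguments are sets of device identifiers (hypothetical format).
    """
    if not failed_test:
        raise ValueError("the reference test has no failing devices")
    return 100.0 * len(failed_ecr & failed_test) / len(failed_test)


# Made-up device IDs for illustration only.
ecr_failures = {1, 2, 3, 4, 7}
ram_failures = {2, 3, 4, 9}
print(conditional_coverage(ecr_failures, ram_failures))
```

A high value for every T k is what makes the ECR attractive as a first screening step: devices it rejects need not be subjected to the remaining tests.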
Comparison to I ddq Test
In the original test suite, the fault coverage of I ddq test is greater than that of any other test. Though it offers a high fault coverage, the excessive test application time is a significant disadvantage relative to other tests. The fault coverage of the ECR is more than 4% higher than that of the I ddq test. The conditional fault coverage of the I ddq tests is higher than that of the ECR for only two tests, the DC parametric and static current tests. For other tests, such as the rom test, the conditional fault coverage of I ddq test is significantly lower. ECR test offers several advantages relative to I ddq test.
The currents for the dynamic tests are in the µA range. The currents for the I ddq tests are in the tens of nA range. In general, the larger average currents are far easier to measure than the small static currents.
The application time for I ddq tests is significantly higher than for the dynamic current tests. For each I ddq vector, substantial time is required for the signals in the circuit to resume a quiescent state.
I ddq tests have been criticized for failing a large number of devices which have passed all other tests. The ECRs for fault-free circuits have been shown to be very tightly clustered, both in simulation and in practice as well. The probability of fault-free circuits failing the ECR is negligible.
The impact of a defect on the static current cannot be magnified. This is especially critical in circuits with a large number of transistors. Thus, high quality measurement tools on expensive testers are needed to detect faults. The ECR markedly reduces the impact of process variations while preserving the impact of a fault. Further, the impact of a fault can also be increased by increasing input frequency and/or the supply voltage.
The effectiveness of I ddq techniques can be increased by using on-chip partitioning techniques, and on-chip sensors. Based on the available evidence, it is reasonable to believe that a similar approach would also be effective for ECR-based test. Further the sizes of the partitions could be significantly larger than for I ddq test.
Relative to dynamic current test techniques which monitor the transients for a single transition, ECR-based test offers the advantage of requiring less accuracy in synchronization. When monitoring individual transients, phase errors in measurement can potentially invalidate the test.
Cross-Wafer Threshold Performance
The results discussed above were obtained by computing the ECRs for wafers which were defined as being good. Clearly, during a regular test process such information is not available in advance, as it was for this experiment. Several alternatives can be explored to set the threshold. For example, the ECR can be computed for all the devices, and the mean of the resulting distribution can be used to set the threshold. The thresholds will be skewed if several defective devices have near identical ECRs. A second approach is to randomly subject a few of the devices to other tests, and use the fact that the ECRs of good devices are tightly clustered to identify suitable thresholds. The limitation of both these approaches is that the ECRs for all the devices have to be measured before pass/fail decisions can be reached.
We investigate a third approach: using the information from one wafer to set the thresholds for the remainder of the test process. Table 6 shows the results obtained when such an approach is applied to the sample data. The first row represents the results obtained when data from all wafers is used to set the thresholds. The remaining six rows show the results obtained when data from a single wafer, identified in the first column, is used to set the thresholds.
Table 6 : Cross-wafer ECR threshold performance
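The cross-wafer approach can be sketched as follows in Python. The sketch assumes that the thresholds are taken as the minimum and maximum ECR of the known-good devices on the reference wafer, mirroring how the overall thresholds were chosen; the function names and all ECR values are invented for illustration.

```python
def thresholds_from_wafer(good_ecrs):
    """Derive (low, high) ECR thresholds from one wafer's known-good devices."""
    if not good_ecrs:
        raise ValueError("reference wafer has no good devices")
    return min(good_ecrs), max(good_ecrs)


def bin_devices(ecrs, low, high):
    """Return True (pass) or False (fail) for each device ECR."""
    return [low <= value <= high for value in ecrs]


# Hypothetical ECRs of good devices on the reference wafer.
reference_good = [0.8012, 0.8030, 0.8041]
low, high = thresholds_from_wafer(reference_good)

# Hypothetical ECRs measured on devices from another wafer.
other_wafer = [0.8020, 0.7950, 0.8045]
print(bin_devices(other_wafer, low, high))
```

With thresholds fixed from one wafer, each subsequent device can be binned as soon as its two currents are measured, which avoids the limitation of the first two approaches.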
Conclusion
Normal process variations can significantly degrade the quality of transient test techniques. A new metric, the energy consumption ratio (the ECR), can, to a first order, negate the impact of process variations while preserving the impact of a fault on the power dissipation of a circuit. In simulation, the ECR demonstrated the potential to offer significant improvements in fault coverage over transient current test. We addressed several practical issues which have to be resolved to apply the ECR to IC test. Though it is an accurate methodology, circuit simulation is too expensive to use as a tool to assess the detectability of a fault using the ECR. We developed a logic-level fault simulator for the ECR which accurately estimates the detectability of a fault in the presence of process variations. Other dynamic current test techniques reported in the literature require circuit simulation to estimate fault detectability. We demonstrated the ability of the ECR to detect real defects through the application of the ECR to a low voltage sub-micron IC product. The potential effectiveness of the ECR was confirmed by the results achieved with the target IC. Readers may note that the currents in the ECR were already components of the test suite for the product, and were not chosen specifically to maximize the fault coverage of the ECR. The ECR was computed by simply reinterpreting the existing test data. The test process was conducted by setting a threshold for the ECR, and failing all units whose ECRs fall outside the threshold. Since the ECR is nearly constant across wafers, simple sampling techniques can be used to set the threshold in a real test process.
The ECR more than doubled the fault coverage of the transient tests. For every other test, on average the ECR detected nearly 95% of the defective units detected by that test. The coverage achieved by the ECR was higher than that offered by any other test, including I ddq test. Yet the ECR is significantly less expensive than I ddq test in terms of test duration, the difficulty of application, a low probability of failing good units and overall test costs. An ECR-based test methodology can be a high-quality screening technique that can be applied using simple testers. To the best of our knowledge, this is the first dynamic current-based test technique to be validated through extensive application to a manufactured IC.
