Abstract
Introduction
Testing accounts for a substantial share of the cost of manufacturing semiconductor devices ("chips"). According to industrial sources, the already substantial testing costs are expected to rise further in the next decade [6]. Test-related costs are incurred at several points during the design and manufacturing cycle. A recent cost model [11] identifies four major cost components of testing: test preparation, test execution, test-related silicon costs, and imperfect test quality. The last component is the focus of this paper. We propose an analytical approach for accurately estimating the 'reject ratio' [1] (also called 'escape rate' or 'defect level' in the literature). The results may also help ascertain yield loss due to non-functional testing methods, such as IDDQ and scan.
Test plans for currently manufactured devices involve some combination of tests for functional, scan, IDDQ, delay, and possibly, bridging faults. As no single type of test is believed to catch all defective parts, cost-benefit analysis is made for including different types of tests and deciding their order of application. Given that test preparation and application costs generally rise with test length, the question arises: when is the point of diminishing return reached for a particular type of test? It has been shown that the stuck-at fault coverage of a test is a very unreliable measure to answer this question [4]. In this paper we provide an alternative way of answering the question.
† Home institute: University of Pune, Pune, India
Past Work
Recognizing that the reject ratio, while easy to define, is hard to estimate through direct measurement, researchers have proposed several alternative approaches for this purpose during the past twenty years. Common to all the methods is statistical modeling of the circuit testability, the chip testing process, or both. However, the earlier models [1, 3, 7, 10, 12] fail to account adequately for the frequent jumps (discontinuities) in the chip tester data.
In earlier work [2], we posited such jumps to be an essential characteristic of the tester data and accounted for them in a statistical model by introduction of fault latency as a new parameter. The model provided an expression for the yield of chips after application of n test vectors in terms of fault latencies and detectabilities as parameters. These parameters are estimated directly from the tester data, as is the process yield, thus allowing an estimate of the field reject ratio.
An attractive feature of the above approach is its independence of any specific fault model and fault coverage. These are implicitly accounted for by the latency and detectability parameters. Because these parameters are estimated from the tester data, the model can self-adjust for any inadequacies of the test.
Our recent investigations indicate that the reject ratio estimates produced by the latency-based approach may be unduly optimistic and not quite robust. One source of error can be traced to the way the process yield is estimated from the tester data; even a slight error in the estimate can have a significant effect on the calculated reject ratio. Another source of error comes from the assumption that the range of fault latencies is unbounded but all accounted for by the end of the testing process.
Our proposed approach differs from the previous ones in that no explicit failure mechanism is assumed in our model. In the prior studies, the modeling parameters came from such assumptions, e.g., fault coverage and yield [10, 12]; fault coverage, yield, and the average number of faults on a defective chip [1]; Stapper's two yield parameters and the average number of faults per defect [7]; and yield, fault latencies, and fault-detection probabilities [2]. We believe the failure mechanisms to be too numerous and complex to be captured accurately by these means. Instead, we derive our model parameters from commonly observed characteristics of faults, the testing process, and test data that appear to be invariant across a range of manufacturing processes.
An advantage of the proposed approach is that a yield estimate is not required; it can therefore also be applied during the yield-learning phase of manufacturing. All previous methods require a yield estimate, and their results are very sensitive to this value. On the other hand, because the approach requires tester data, an obvious disadvantage is that the analysis can only be carried out after a batch of devices has been fabricated and tested.
An Empirical Model for Testing
Our method is based on two statistical quantities, namely event-probability and event-size distribution probability. We believe that these probabilities are well-behaved functions of test vectors in every test result and can be modeled by simple functions. These functions are used to model the cumulative fallout (number of detected faulty devices). The estimate for the reject ratio is determined by computing the asymptotic value of the cumulative fallout.
In the rest of this section we first state and justify the general assumptions used in our model regarding the faults, the testing process, and the test data. As an illustration, we use wafer test data for a high-volume digital CMOS device obtained from Delco Electronics; the wafer test for the device covered 99.7% of the stuck-at faults [3]. Future work will include validation of the model against other test data sets available to us.
Fault Characteristics
We distinguish faults, i.e., functional deviance, from physical defects (or simply, defects). The latter occur during device fabrication and may cause faults. Unfortunately, it is hard to characterize precisely which defects are likely to occur and with what frequency. Further, the commonly used fault models capture only a small subset of all faults, and the relationship between defects and modeled faults is not always easy to establish. In view of these uncertainties, we state and justify two rather general assumptions about defects and faults that are useful in analyzing device test data.
Assumption 1 Device defects cause varying degrees of damage from the testing viewpoint, i.e., some defects are detectable by a very large number of test vectors while others are detectable by only a few vectors.
The assumption can be justified in terms of defect size, which varies over a wide range and causes varying degrees of damage in the logical domain; see [8]. Also, the vulnerability of the chip function varies with the location of the damage. Further justification for the assumption may be found in the great variance typically observed in the detectability of stuck-at faults in a circuit.
Assumption 2 Defects that affect many devices occur less frequently than those that affect fewer devices.
This assumption is true as a consequence of the yield learning that takes place with maturation of a fabrication process. In order to improve the yield it is most beneficial to eliminate defects in order of frequency of occurrence. Thus effort would be directed towards identifying and removing the causes of frequently-occurring defects.
Testing-Process Characteristics
Test vectors may be functional, generated randomly, or generated by targeting faults based on a fault model. In this section we argue that even when a fault model is used, the test generator has only partial information, and the characteristics of the test results do not differ, especially towards the end, no matter how the test vectors are generated.
As already discussed, fault models have a rather uncertain relation to the actual defects, which are themselves hard to characterize precisely. The most significant information that is not available to a test generator is the number of chips affected by a given fault (the fault occurrence probability). Another unknown is which set of faults is caused by the same defect; ideally, a test generator needs to generate a test for only one of them. Aside from this, most fault models tend to ignore complex faults. It is often difficult to generate tests without assumptions such as single-fault, no-bridging-fault, etc.
While targeting one fault, a test vector also detects several additional faults. It is reasonable to assume that the chances of detecting these additional faults would be about the same if the test vectors were generated randomly. As the test progresses, the set of undetected faults is dominated more and more by the unmodeled faults. Since unmodeled or uncovered faults can only be detected by happenstance, in the later part of a test the fallout should be materially independent of how the test was generated. The defect level (reject ratio) of the final yield depends only on these uncovered or unmodeled faults, so it should be possible to estimate it by extrapolating a function that models the later results with high fidelity. Figure 1 shows typical per-vector device fallout due to a test. Figure 2 shows the cumulative fallout. It is clear from the figures that there is a large variance in the fallout size. A natural question then arises: can there exist a smooth (differentiable) curve that could be considered a good approximation to the cumulative fallout?
Test-Data Characteristics
In the initial part of the test the variance in the fallout is so large that no smooth curve can be considered a good fit. In the later part, however, the fallout size varies within a narrow range, so a good fit can be found. This assertion was verified for our test data set using standard measures of goodness of fit [9].
Since the reject ratio depends on the number of chips that would have been found faulty had the test been continued, it can be estimated with considerable confidence by fitting a smooth curve to the cumulative fallout curve and extrapolating it.
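The idea of fitting the tail of the cumulative fallout and extrapolating can be sketched in a few lines of Python. This is an illustration only: the saturating-exponential form, the function name `fit_saturating_tail`, and the synthetic data are assumptions made for the example and are not the model developed in this paper.

```python
import math

def fit_saturating_tail(ts, cum, c_grid):
    """Fit cum(t) ~ c_inf - a * exp(-c * t) to tail data.

    Assumed illustrative form (not the paper's model). For each candidate
    decay rate c, (c_inf, a) follow from ordinary linear least squares;
    the c with the smallest squared error is kept.
    """
    n = len(ts)
    best = None
    for c in c_grid:
        x = [-math.exp(-c * t) for t in ts]      # regressor that multiplies a
        sx, sy = sum(x), sum(cum)
        sxx = sum(v * v for v in x)
        sxy = sum(v * y for v, y in zip(x, cum))
        det = n * sxx - sx * sx
        if abs(det) < 1e-12:                     # regressor is numerically constant
            continue
        a = (n * sxy - sx * sy) / det            # least-squares slope
        c_inf = (sy - a * sx) / n                # least-squares intercept (asymptote)
        sse = sum((c_inf + a * v - y) ** 2 for v, y in zip(x, cum))
        if best is None or sse < best[0]:
            best = (sse, c_inf, a, c)
    if best is None:
        raise ValueError("no usable decay rate in c_grid")
    return best[1], best[2], best[3]

# Synthetic (hypothetical) tail of a cumulative fallout curve.
ts = list(range(2000, 6001, 100))
cum = [500.0 - 300.0 * math.exp(-0.002 * t) for t in ts]
c_grid = [i * 1e-4 for i in range(1, 101)]

c_inf, a, c = fit_saturating_tail(ts, cum, c_grid)
extra_fallout = c_inf - cum[-1]   # chips projected to fail if testing continued
print(f"asymptote={c_inf:.3f}, projected extra fallout={extra_fallout:.5f}")
```

The grid search over the decay rate keeps the example free of external libraries; with real tester data a general nonlinear least-squares routine would be the more natural choice.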
Structure of the Model
As discussed above, we view testing as a random process. Let $f(t)$ denote the random variable representing the number of chips found faulty on application of test vector $V_t$, for $1 \le t \le T$, and let $N$ denote the total number of chips.
Modeling the testing process involves developing a parametric expression, $f_M(t)$, that approximates the expectation value of $f(t)$.
Reject Ratio Computation
To estimate the reject ratio, the parameter values must first be determined so that $f_M(t)$ fits the actual instance of $f(t)$ well. Here $r_T$ denotes the contribution to the reject ratio of the faulty chips detected by the remaining test vectors, i.e., $V_{t'}$ for $t < t' \le T$. The quantity $r_T(f_M; t)$ is useful for checking the robustness of the model by comparing it with $r_T(f; t)$, which can be computed from the actual data. Table 1 shows the values of $r_T(f; t)$ and $r_T(f_M; t)$ for various values of $t$ in our experiment with the Delco data.
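As a rough illustration of how a partial reject ratio could be computed from per-vector fallout counts, consider the Python sketch below. The normalization is an assumption: the chips caught by vectors after $t$ are divided by the chips that passed the first $t$ vectors, and the result is expressed in parts per million. The function name and the fallout numbers are hypothetical.

```python
def partial_reject_ratio_ppm(fallout, n_chips, t):
    """Partial reject ratio in ppm under an ASSUMED definition.

    fallout[i] is the number of chips first failing at vector i + 1.
    Numerator: chips detected by vectors t+1..T.
    Denominator: chips that passed vectors 1..t.
    """
    if not 0 <= t <= len(fallout):
        raise ValueError("t must lie in [0, T]")
    passed = n_chips - sum(fallout[:t])
    if passed <= 0:
        raise ValueError("no chips pass the first t vectors")
    return 1e6 * sum(fallout[t:]) / passed

# Hypothetical per-vector fallout for a batch of 1000 chips.
fallout = [50, 20, 0, 5, 0, 1]
r_after_3 = partial_reject_ratio_ppm(fallout, 1000, 3)
r_at_end = partial_reject_ratio_ppm(fallout, 1000, len(fallout))
print(r_after_3, r_at_end)
```

Applying the same function to the model values in place of the observed fallout would give the model-side quantity to compare against.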
Expectation Value of ft
Definition We consider the failure of one or more chips at vector $V_t$ (i.e., $f(t) > 0$) as an event at $t$. Let a binary random variable $\epsilon(t)$ characterize the event at $t$: the value $\epsilon(t) = 1$ denotes that an event of size $f(t)$ has occurred at $t$.
Figure 3. Cumulative events vs vector number
where $\mathrm{Prob}(f(t) = j \mid \epsilon(t) = 1)$, the event-size distribution probability, denotes the probability of $V_t$ detecting $j$ faulty chips given that an event occurs at time $t$, and $\mathrm{Prob}(\epsilon(t) = 1)$, the event probability, denotes the probability of an event at time $t$. Here $\epsilon(t)$ is the binary event indicator. It is assumed that the event size is bounded above by $m(t)$.
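For reference, the decomposition of the expected fallout into these two probabilities can be written out as follows. This is a sketch derived from the definitions given here, not a reproduction of the original typeset equation; the symbol $\epsilon(t)$ for the binary event indicator is a notational choice made for this sketch. It uses only the fact that $f(t) = 0$ whenever no event occurs.

```latex
\begin{align*}
E[f(t)] &= \sum_{j=0}^{m(t)} j \,\mathrm{Prob}\bigl(f(t) = j\bigr) \\
        &= \mathrm{Prob}\bigl(\epsilon(t) = 1\bigr)
           \sum_{j=1}^{m(t)} j \,\mathrm{Prob}\bigl(f(t) = j \mid \epsilon(t) = 1\bigr)
\end{align*}
```

The $j = 0$ term vanishes, and every $j \ge 1$ can occur only when $\epsilon(t) = 1$, which gives the second line.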
Event Probability
Assumption 1 suggests that defects with higher visibility (i.e., detectable by a larger number of test vectors) are likely to be detected earlier in the test. The event probability is therefore expected to decrease as the test proceeds. Figure 3 shows the cumulative number of events, $\sum_{t'=1}^{t} \epsilon(t')$.
Here we develop a model based on a simpler assumption: the probability of an event is proportional to the number of undetected faults.
Let $F(t)$ denote the number of faults remaining undetected after $t$ test vectors. If we assume that the rate of detection of new faults is proportional to the number of undetected faults, then $F(t)$ can be approximated by $F(0)\,e^{-ct}$.
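The exponential form follows from the proportionality assumption in a few steps. The sketch below treats the vector index $t$ as a continuous variable, which is an approximation, and takes $c > 0$ as the constant of proportionality.

```latex
\begin{align*}
\frac{dF}{dt} &= -c\,F(t), \qquad c > 0 \\
\int_{F(0)}^{F(t)} \frac{dF}{F} &= -c \int_{0}^{t} dt' \\
\ln \frac{F(t)}{F(0)} &= -c\,t \\
F(t) &= F(0)\,e^{-ct}
\end{align*}
```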
Assume that every input vector detects a set of $S$ faults, where $S$ is a random variable with a narrow range. Of the total possible subsets of $S$ faults, [...]
Figure 4 shows the best fit of the event curve by Equation 3. The fit is especially good in the second half of the test.
Event Size Distribution
From Assumption 2 it can be deduced that faults affecting a large number of chips occur less frequently. On this basis we conclude that the event-size distribution probability should be negatively correlated with the event size $j$.
In this study, a simple model for the event-size distribution is used, namely, $\mathrm{Prob}(f(t) = j \mid \epsilon(t) = 1) \propto 1/j^k$ for $1 \le j \le m(t)$, where $\epsilon(t)$ is the binary event indicator and $m(t)$ denotes the size of the largest possible fallout at $V_t$. Figure 5 shows the event-size distributions in various windows of 1000 test vectors and their approximation by the model curves $1/j^k$. The optimum value of $k$ is found to vary slightly from window to window, so we replace $k$ by
$$k(t) = a + b\,t \qquad (4)$$
Normalizing the function, we get the expression for the event-size probability
$$\mathrm{Prob}(f(t) = j \mid \epsilon(t) = 1) = \frac{1/j^{k(t)}}{\sum_{i=1}^{m(t)} 1/i^{k(t)}}, \quad 1 \le j \le m(t) \qquad (5)$$
As observed earlier in this section, the variance of the fallout, and with it $m(t)$, decreases quickly as the test progresses. In this work we have modeled $m(t)$ in the same way as the event probability, so
$$m(t) = m_0 \left(1 - \left(1 - e^{-gt}\right)^{h}\right) \qquad (6)$$
Combining the event probability with the event-size probability of Equation 5 gives the model $f_M(t)$ of Equation 7, where $k(t)$ and $m(t)$ are given by Equations 4 and 6, respectively.
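The event-size model can be made concrete with a short Python sketch. It reads Equations 4 to 6 as $k(t) = a + bt$, a power law $1/j^{k(t)}$ normalized over $1 \le j \le m(t)$, and $m(t) = m_0(1 - (1 - e^{-gt})^h)$. The parameter values are hypothetical, and rounding $m(t)$ to an integer of at least 1 is an assumption made so that the sum has a well-defined upper limit.

```python
import math

def k_of_t(t, a, b):
    """Exponent of the power law, Equation 4 as read here."""
    return a + b * t

def m_of_t(t, m0, g, h):
    """Largest possible event size, Equation 6 as read here."""
    return m0 * (1.0 - (1.0 - math.exp(-g * t)) ** h)

def event_size_pmf(t, a, b, m0, g, h):
    """Normalized power-law distribution over event sizes 1..m(t)."""
    k = k_of_t(t, a, b)
    m = max(1, round(m_of_t(t, m0, g, h)))   # assumption: integer bound, at least 1
    weights = [j ** -k for j in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_event_size(pmf):
    """Expected number of failing chips, given that an event occurs."""
    return sum(j * p for j, p in enumerate(pmf, start=1))

# Hypothetical parameters: k = 1 throughout, up to 4 chips per event at t = 0.
early = event_size_pmf(0, 1.0, 0.0, 4, 0.001, 2)
late = event_size_pmf(5000, 1.0, 0.0, 4, 0.001, 2)
print(early, mean_event_size(early))
print(late, mean_event_size(late))
```

Multiplying the mean event size by an event probability for the same vector gives the expected per-vector fallout under this reading of the model.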
Model Verification and Results
In this section we test the robustness of the model, Equation 7, by comparing the actual and the predicted reject ratios. Figure 6 shows the actual and model cumulative-fallout curves. Since the model curve represents the expected value of the fallout, it can be considered a very good fit to the actual data, which are an instance of the fallout.
To determine the goodness of the fit, partial reject ratios, $r_T$, were computed from the actual data and the model. These ratios, in parts per million, are given in Table 1.
Prediction
The reject ratio was estimated using Equation 1. The limiting value of the cumulative fallout, the numerator of the right-hand side of the equation, was estimated by extrapolating the model curve of Figure 6. The estimated reject ratio for various parameter values, determined from different local optima, varied from 1305 ppm to 2480 ppm for the Delco data set. The narrow range of variation indicates that the model is robust.
Conclusion
By extrapolating the test results, the proposed approach allows one to estimate how many more devices would eventually fail if the test were carried on indefinitely. It should be noted that the criterion used for determining what is a good or a bad device may differ with test type. For example, scan tests may reject a functional device that fails on a non-functional vector. Similarly, a delay test may reject a device that is functional but does not meet the delay requirement. Thus, the extrapolation gives a value of the reject ratio that is specific to a type of test.
Recent pilot studies have produced test results for different types of tests (scan, functional, etc.) on the same set of devices [4, 5] . With these data sets it will be possible to study and compare the projections of our model for the various types of tests. We are also considering how the model can be extended to estimate reject ratio when a combination of tests is applied to a batch of devices. There is enough information available in the data sets to verify any extensions to combined tests.
