Abstract-A method for determining a test chip sample size to estimate effectively the electrical parameter distributions on an integrated circuit wafer is presented. This method gives relations among sample size and the figure of merit for four statistical techniques (trimmed mean, biweighted mean, median, and arithmetic mean) by which estimates are calculated. To demonstrate its use, the method has been applied to the evaluation of a CMOS fabrication process. Measurements on wafers completely patterned with identical test chips were used to determine actual parameter distributions for an entire wafer (true parameter values). Estimates of true parameters were determined using a site selection plan which is representative of sampling plans employed in industry. The above four statistical techniques were used to compute estimates for electrical parameters and their respective figures of merit. These estimates were compared with the true parameter values determined from testing all test chips on the wafer. This method may be used in conjunction with other criteria for determining test chip sample size and enables one to make judgments on the effectiveness of sampling strategies for various processes and process technologies. The results, reported in this paper for CMOS processes, are interpreted using graphs of the figure of merit versus the sample size.
I. INTRODUCTION
A COMMON method used to assess and monitor the control of custom integrated circuit fabrication processes is to substitute a test chip or "drop-in" containing many individual test structures at various locations on the wafer in place of product die. These chips are tested to determine the value and range of various electrical parameters which can be correlated with parameters in the fabrication process. With the increasing complexity of custom, low volume circuits, results from these test chips can also be used as the basis for wafer acceptance. Because the test chips replace product sites, and therefore reduce product yield and increase product cost, very few test chips (typically four or five) are included on a production wafer. Consequently, the accuracy to which critical electrical parameters can be estimated and process performance evaluated is presently dependent on test results from these few test sites. This paper presents a method for relating the test chip sample size to a figure of merit for a given statistical estimation technique to be performed on the sample. This method can be used as a basis for determining the most effective test chip sample size and estimation technique for a given electrical parameter and process.
To demonstrate the usefulness of this technique, test results from a number of wafers entirely patterned with test chips were evaluated. The data used in this paper consist of six device parameters from several different fabrication processes. These parameters are metal-to-p+ and metal-to-n+ contact resistances, MOSFET p-channel and n-channel threshold voltages, and p- and n-channel MOSFET transconductance.
These parameters were selected to represent extremes in critical parameter spatial distributions. Previous work [1] using these data showed that MOSFET threshold voltage and transconductance exhibited the most stable range of spatial distribution and contact resistance exhibited the least stable one. The fabrication process was a silicon gate complementary metal-oxide-semiconductor silicon-on-sapphire (CMOS/SOS) technology. Additional electrical measurements from a p-well CMOS bulk process were also examined.
II. STATISTICAL ESTIMATING TECHNIQUES USED
Conceptually, a model for the distributions assumed in this study might involve one component representing the bulk of the data (symmetric, but with heavier tails than the commonly referenced Gaussian distribution) and another component representing the highly outlying values occurring far from the others. Such a situation occurs frequently in electrical testing on wafers due to process irregularities. The parameters for the distribution representing the bulk of the data are of primary interest, yet a sample from this population could include one of these outlying values. A primary goal of estimation is therefore to estimate the mean of the major distribution while downweighting observations from the outlying distribution. The four estimators described in this section have been chosen in this study to deal specifically with such populations, i.e., ideally Gaussian but possibly contaminated either by stretched tails or outliers, or both.
This section will describe the four estimating techniques that were applied to each sample generated from the true population of measurements. Three of these estimators (median, biweight, trimmed mean) have been shown to be highly robust estimators of the mean of a symmetric, possibly contaminated population [2], [3]. The sample mean was included because of its popularity and for comparison. The sampling procedure and a description of the method used to obtain a statistical efficiency or figure of merit will be discussed in later sections.
0018-9383/84/0200-0257$01.00 © 1984 IEEE 2, FEBRUARY 1984
A. Median and Arithmetic Mean
The first two estimation techniques applied to the samples were the median and the arithmetic mean. The median was defined as the data value that is in the middle when the measurements are arranged in order of magnitude (or as the average of the middle two observations when the sample size is even). If $x_{(1)}, \ldots, x_{(n)}$ represents the ordered sample of n observations
$$\mathrm{median} = \begin{cases} x_{((n+1)/2)}, & n \text{ odd} \\ \tfrac{1}{2}\left[x_{(n/2)} + x_{(n/2+1)}\right], & n \text{ even.} \end{cases}$$
For completeness, the definition of the arithmetic mean is provided
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where n is the sample size.
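As a small illustration of why the median is considered alongside the arithmetic mean, the sketch below computes both for a five-point sample containing one outlying value. The function names and the data values are hypothetical and are not taken from the measurements in this paper.

```python
def median(values):
    """Middle order statistic, or the average of the two middle ones."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])


def arithmetic_mean(values):
    """Sum of the observations divided by the sample size."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical sample: four similar readings and one outlying value.
    sample = [1.0, 1.1, 0.9, 1.05, 5.0]
    print("median:", median(sample))
    print("mean:  ", arithmetic_mean(sample))
```

In this hypothetical sample the single outlying value pulls the arithmetic mean well above the other four readings, while the median stays among them.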
B. The Biweight Statistic
The biweight [4] is a robust estimator that assigns a weight w(u) to each data point in the sample according to its distance from the biweighted mean T, measured in multiples of a scale estimate s. The biweighted mean T and the variance $V_{Bi}$ can then be determined from the weighted data points.
For data points $x_1, \ldots, x_n$, the biweighted mean T is defined as the solution to
$$\sum_{i=1}^{n} \psi(u_i) = 0 \tag{1}$$
where $\psi$ is given by
$$\psi(u) = u\,w(u), \qquad w(u) = \begin{cases} (1-u^2)^2, & |u| \le 1 \\ 0, & |u| > 1 \end{cases} \tag{2}$$
where $u_i = (x_i - T)/(cs)$, s is a scale estimate, and c is a constant. Solving (1) for T gives
$$T = \frac{\sum_{i=1}^{n} x_i\,w(u_i)}{\sum_{i=1}^{n} w(u_i)}. \tag{3}$$
The biweighted mean T is an iteratively reweighted mean (the weights change with each updated T), and a zero weight occurs whenever $|x_i - T|/s > c$. Equation (3) permits an iterative solution for T. Hence, the following algorithm was used to calculate the biweighted mean of the sample. First, initial guesses for T and s are calculated from the sample. For this study, the initial location estimate is the median, and the initial scale estimate is 1.5 times the median of the absolute deviations from T, i.e., $s = 1.5 \times \mathrm{med}|x_i - T|$ (it is more robust to non-Gaussian distributions than the sample standard deviation; see [4]).
These values were used in (2) and (3). An updated biweighted mean $T_{new}$ was calculated from (3). Then $T_{new}$ was compared to T. If $T_{new}$ was close to T, i.e., $|T_{new} - T|$ < 5 × 10^-', the iteration was stopped; otherwise T was replaced by $T_{new}$ and a new $T_{new}$ was found from (3).
All data points that lie more than c multiples of the scale estimate s from the biweighted mean are assigned a weight of zero and have no influence on the calculation of T. Notice that c = ∞ corresponds to $T = \bar{x}$, the sample arithmetic mean. The choice of c is less critical than the choice of the functional form of the estimator. In this study c was assigned the value of 6 because the resulting biweighted mean T is nearly as efficient as $\bar{x}$ if the underlying distribution is Gaussian, yet performs well on non-Gaussian distributions (e.g., a contaminated Gaussian such as that assumed for some of the electrical parameters considered in this study, for instance, contact resistance). A sensitivity analysis of the biweight efficiency for other values of c between 4 and 9 and the biweight's performance on Gaussian as well as non-Gaussian situations can be found in [3].
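The iteration described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the function names, the convergence tolerance `tol`, the iteration cap `max_iter`, and the choice to recompute the scale s around each updated T are all assumptions made for the example.

```python
def _median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])


def biweight_mean(xs, c=6.0, tol=1e-6, max_iter=100):
    """Iteratively reweighted biweight location estimate.

    tol and max_iter are assumed values chosen for this sketch.
    """
    t = _median(xs)  # initial location estimate: the median
    for _ in range(max_iter):
        # Scale: 1.5 times the median absolute deviation from t
        # (recomputed at every step, which is an assumption here).
        s = 1.5 * _median([abs(x - t) for x in xs])
        if s == 0.0:
            return t  # at least half the points coincide with t
        num = 0.0
        den = 0.0
        for x in xs:
            u = (x - t) / (c * s)
            if abs(u) < 1.0:  # points beyond c*s get zero weight
                w = (1.0 - u * u) ** 2
                num += w * x
                den += w
        t_new = num / den
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t


if __name__ == "__main__":
    # Hypothetical sample with one wild value.
    print(biweight_mean([1.0, 2.0, 3.0, 4.0, 5.0, 100.0]))
```

With c = 6 the wild value in the hypothetical sample receives zero weight, so the estimate settles near the center of the remaining five points.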
C. Trimmed Mean
The fourth estimating procedure applied to the samples was to trim the sample symmetrically. This was performed by excluding a fraction α of the largest and a fraction α of the smallest data values from the sample, where α is the fraction trimmed on each side. If αn is not an integer, fractional weights are assigned to the extreme observations of the trimmed sample. The mean of the remaining data points, or "trimmed mean," was then calculated.
The trimmed mean is defined formally by the following equation:
$$\bar{x}_{\alpha} = \frac{1}{n(1-2\alpha)}\left[\sum_{i=[\alpha n]+2}^{n-[\alpha n]-1} x_{(i)} + \left(1 + [\alpha n] - \alpha n\right)\left(x_{([\alpha n]+1)} + x_{(n-[\alpha n])}\right)\right]$$
where
$x_{(i)}$  the ith ordered observation
α  the fraction trimmed on each side
n  number in sample
[αn]  the integer part of the product αn.
For this work α = 0.2, resulting in a 20-percent symmetrical sample trimming. A 20-percent trimmed mean was chosen because there was a special interest in symmetrically trimming one observation from each end of a sample of size 5. As with the scale multiplier c for the biweight, different values of α will not affect the conclusions substantially. A sensitivity analysis of the α-trimmed mean for values of α between 0.10 and 0.25 and its performance for a broad range of symmetric distributions can be found in [5].
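The fractional weighting of the extreme retained observations is easy to get wrong, so a short sketch may help. It implements the standard symmetric trimmed mean with fractional end weights; the function name and the example data are hypothetical.

```python
import math


def trimmed_mean(xs, alpha=0.2):
    """Symmetric alpha-trimmed mean with fractional end weights."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(xs)
    n = len(ordered)
    g = math.floor(alpha * n)  # whole observations dropped per side
    if 2 * g + 1 == n:
        return ordered[g]  # only the middle observation remains
    # Weight on the smallest and largest retained observations.
    end_weight = 1.0 + g - alpha * n
    inner = sum(ordered[g + 1 : n - g - 1])  # fully weighted interior
    total = inner + end_weight * (ordered[g] + ordered[n - g - 1])
    return total / (n * (1.0 - 2.0 * alpha))


if __name__ == "__main__":
    # n = 5, alpha = 0.2: exactly one observation is trimmed per side.
    print(trimmed_mean([1.0, 2.0, 3.0, 4.0, 100.0]))
    # n = 4, alpha = 0.2: each end observation keeps a weight of 0.2.
    print(trimmed_mean([1.0, 2.0, 3.0, 4.0]))
```

For a sample of size 5 and α = 0.2 this reduces to the mean of the middle three observations, which is the case singled out in the text.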
III. ESTIMATOR FIGURE OF MERIT DETERMINATION
To distinguish the performances of the estimators, their variances are compared. An estimator that predicts the true mean with little variability (i.e., <10 percent) would be preferred to one that varies greatly over the N randomly selected samples. Let $D_i(n)$ be the computational result of one of the four estimating techniques (i.e., arithmetic mean, median, trimmed mean, or biweighted mean) applied to the ith sample of size n, $i = 1, 2, \ldots, N$, where N is the number of replications of samples of size n. Then the variance of a particular estimator, denoted by D, based on the N samples of size n can be estimated by the usual formula for the sample variance [6]
$$\widehat{\mathrm{Var}}[D(n)] = \frac{1}{N-1}\sum_{i=1}^{N}\left[D_i(n) - \bar{D}(n)\right]^2$$
where
$$\bar{D}(n) = \frac{1}{N}\sum_{i=1}^{N} D_i(n).$$
The variance of any of the four estimators will decrease with n, the sample size, irrespective of any other property of the estimator. This variance will depend also on the underlying variance of the population. To evaluate each estimator's performance, these two properties of the variance are taken into account by defining the figure of merit, or statistical efficiency, E(n) of an estimator as a function of sample size n. Even though this definition can be justified by considering the efficiency for a Gaussian population [4], it can be applied to populations having other distributions as demonstrated in [5].
IV. STATISTICAL PROCEDURE
To determine the effect of sample size on an estimator's figure of merit, the entire wafer was patterned with test chips. Then, several samples of varying sizes were drawn from this population of test chips, and the estimators were calculated. Letting
M  the number of test chips on the wafer
n  the size of the sample drawn from the M test chips
N  the number of replications of samples of size n
the following procedure was followed:
1) The true mean μ and true variance $\sigma^2_{true}$ were determined for an electrical parameter over a wafer containing only test chips. Section V describes the technique used to determine the true mean and true variance.
2) One hundred replications of samples of size n, n = 1, 2, ..., 20, were generated from this set of M measurements via a constrained randomized sampling plan that was used to simulate a variety of sampling plans that would be used to sample test chips in industry. This sampling technique is described in Section VI.
3) The median, arithmetic mean, biweighted mean, and 20-percent trimmed mean were calculated for each of the N = 100 replications and were used as estimators of the true parameter μ. The calculation of these estimators was defined in Section II.
4) The variance of each estimator over the N = 100 samples was calculated and compared to the true variance. The effectiveness of each estimator was determined by calculating the figure of merit E(n), described in Section III, associated with each estimator.
5) The figures of merit for all four estimators for a particular electrical parameter are plotted against sample size. The estimator that provides the highest efficiency as a function of sample size can be determined from these plots by using criteria such as relative magnitudes and slopes of the figures of merit E(n) versus sample size (n).
From such graphs, it is then possible to determine which estimation technique yields the greatest figure of merit for that particular parameter. Also, for a given estimator, the graph indicates a sample size for which the figure of merit is greatest. These plots for different parameters are discussed in Section VIII.
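The five steps above can be sketched as a small Monte Carlo routine. The sketch assumes that the figure of merit takes the form E(n) = σ²_true / (n · Var[D(n)]), that is, the variance σ²_true/n of the mean of n independent Gaussian observations divided by the observed variance of the estimator; this form is an assumption consistent with the description in Section III and should not be read as the authors' exact definition. Simple random sampling without replacement stands in for the constrained plan of Section VI, the ordinary population variance stands in for the biweight-based true variance, and all function names and the synthetic population are hypothetical.

```python
import random
import statistics


def estimator_variance(population, estimator, n, replications=100, rng=None):
    """Sample variance of an estimator over repeated samples of size n."""
    rng = rng or random.Random(0)
    estimates = [
        estimator(rng.sample(population, n)) for _ in range(replications)
    ]
    # statistics.variance uses the usual 1/(N - 1) normalization.
    return statistics.variance(estimates)


def figure_of_merit(population, estimator, n, true_variance,
                    replications=100, rng=None):
    """Assumed efficiency: (true_variance / n) / Var[D(n)]."""
    var_d = estimator_variance(population, estimator, n, replications, rng)
    return true_variance / (n * var_d)


if __name__ == "__main__":
    gen = random.Random(1)
    # Synthetic stand-in for the M measurements on a full wafer.
    population = [gen.gauss(0.0, 1.0) for _ in range(500)]
    true_var = statistics.pvariance(population)
    for n in (2, 5, 10, 20):
        e_mean = figure_of_merit(population, statistics.mean, n, true_var)
        e_med = figure_of_merit(population, statistics.median, n, true_var)
        print(n, round(e_mean, 2), round(e_med, 2))
```

Under this assumed definition an estimator whose variance falls as σ²_true/n scores near 1, so plotting the returned values against n for each estimator gives curves of the kind described in step 5.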
In order to characterize the wafer, it is important to determine initially the true parameter variation over the entire integrated circuit wafer. This can be obtained from measurements on wafers completely patterned with test chips containing microelectronic test devices. The data are examined to identify test results from defective structures which do not accurately represent the parameter being measured. Data are excluded from the main data set if they can be identified as results from a defective test structure.
Data are eliminated for the following conditions: 1) the data are known to be erroneous based on previous knowledge (for example, a test structure which has been physically damaged due to handling, etc.), 2) the test instrument indicates an open or short circuit compliance limit, or 3) the test results are beyond a known limit determined by experience or from results from a circuit simulation model (for example, a certain circuit is known to function only if the MOSFET devices comprising the circuit have threshold voltage values in some range).
The biweight described in Section II was used as an estimate of the true parameter distribution for the remaining population after data were removed for the above conditions.
It is possible that some data points do not reflect measurements from a defective test structure but are apparently wild compared to the other data points. Kruskal [7] noted that such observations may provide important information about the distribution in question and should be included in the true parameter distribution. This information is lost if the observations are simply excluded from the population statistics. The biweight was chosen to define the true mean and variance because, unlike several other estimating algorithms which have a finite probability of excluding a nondefective measurement, it assigns a weight to these observations and includes them in the mean and variance calculations. Therefore, unless an observation is so extreme that it receives zero weight, these observations affect the true parameter distribution, and the potentially important information that they may provide is not lost entirely. However, for the populations in this study, the estimate given by the biweight was similar to the estimates given by other techniques for a given population (e.g., the technique described in [8]). Using the biweight in this application will affect only the vertical scale in the plot of figure of merit versus sample size, and only the relative comparisons among estimators are of interest in this work. The use of the biweight, therefore, will not affect the conclusions.
VI. SAMPLING PROCEDURE
One hundred replications of samples of size n, n = 1, 2, 3, ..., 20, were generated from the initial collection of data values of a particular electrical parameter over an entire wafer.
A constrained randomized sampling plan was used to simulate the variety of different sampling plans that would be used in industry to sample test chips on a production wafer. No attempt was made to determine or define an optimal sampling plan in this study. This sampling plan was based on the following two observations:
1) In industry, test chips usually are not fabricated or sampled on the periphery of the wafer, which can be regarded as the first row and column of chips closest to the perimeter for small wafers (2 to 3 in) or the first two or three rows and columns closest to the perimeter for larger wafers (4 in and above). This is due to a larger density of defects and nonuniformities occurring in this area.
2) Test chips are not clustered in one sector of the wafer. Test chips are usually positioned on the wafer to provide the best wafer coverage.
To incorporate these assumptions without relying on a fixed sampling scheme, an algorithm was developed to generate samples that had data points taken from different regions on the wafer in the following way. First, data points for each sample were selected randomly (via a random number generator) from the true population of electrical parameter measurements. As each new data point was selected, its location was compared to the locations of the previously selected points in the sample. If the new point was not at least a predetermined minimum number of test chips away from the previously selected points, it was rejected and another point was selected randomly. Initially, the minimum number of test chips apart was set to eight. This number provided adequate spacing between sampled test chips for the 3-in diameter wafers used in this work. If after 50 random selections no new data point could meet this criterion, the minimum spacing was decremented by one and the procedure was repeated. By using this procedure, all data points in each sample were selected from different spatial regions on the wafer. For larger wafers having these same test chip dimensions, different spacings may be required to maintain the same wafer coverage as for the 3-in wafers. As mentioned above, the sampling technique used is not entirely random but is selected to represent various sampling plans used in industry. Constrained sampling plans have appeared in the statistical literature to evaluate statistical processes in other situations (e.g., [9]).
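The rejection scheme just described can be sketched as follows. Several details are assumptions made for this example and are not stated in the text: sites are given as (column, row) chip coordinates with the periphery already excluded, separation is measured as the larger of the column and row differences, and a relaxed spacing stays in force for the rest of the sample. The function names are hypothetical.

```python
import random


def _separation(a, b):
    # Assumed metric: larger of the column and row differences.
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def constrained_sample(sites, n, min_spacing=8, max_tries=50, rng=None):
    """Pick n distinct sites, rejecting candidates that are too close."""
    if n > len(set(sites)):
        raise ValueError("sample size exceeds the number of distinct sites")
    rng = rng or random.Random()
    chosen = []
    spacing = min_spacing
    tries = 0
    while len(chosen) < n:
        candidate = rng.choice(sites)
        far_enough = all(
            _separation(candidate, p) >= spacing for p in chosen
        )
        if candidate not in chosen and far_enough:
            chosen.append(candidate)
            tries = 0
            continue
        tries += 1
        if tries >= max_tries:
            # No acceptable site in max_tries draws: relax the spacing.
            spacing = max(spacing - 1, 0)
            tries = 0
    return chosen


if __name__ == "__main__":
    # Hypothetical 20 x 20 array of interior test chip sites.
    grid = [(col, row) for col in range(20) for row in range(20)]
    print(constrained_sample(grid, 5, rng=random.Random(0)))
```

Because the spacing is only ever relaxed, small samples tend to be widely spread while large samples (n near 20) end up with progressively closer sites, which matches the behavior the text describes.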
VII. EXPERIMENTAL DETAILS
A. Integrated Circuit Wafer Description
The data used in this work to demonstrate the four estimating techniques were taken from test structure measurements made on wafers completely patterned with NBS-designed test chips and fabricated in several different fabrication lots from two manufacturers. The integrated circuit process used to fabricate the wafers was a radiation-hardened, silicon gate, CMOS/SOS process. A detailed description of the design rules and chip layout can be found elsewhere [10].
B. Test Structure and Measurement Technique Description
The p- and n-channel MOSFET threshold voltages were measured on a four-terminal MOSFET test structure having a designed gate length of 10 μm and a designed gate width of 64 μm. The procedure used to determine threshold voltage was to locate the region of maximum slope of the I_D (drain current) versus V_G (gate voltage) characteristic curve with a drain-to-source voltage of 0.1 V. Drain current was measured at gate voltage intervals of 0.03 V. Slope was calculated by finding a least squares fit of five consecutive points on the I_D versus V_G curve to a straight line. When the maximum slope was detected, a straight line was extrapolated from this region to I_D = 0. The intercept with the gate voltage axis was then taken as the threshold voltage. This procedure was determined to have a precision of ±0.003 V, where the precision was defined as one standard deviation of the mean for a population of identical measurements.
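The maximum-slope extrapolation described above can be sketched as below. This is an illustration under stated assumptions rather than the test program used in the study: the function names are hypothetical, the steepest window is chosen by slope magnitude so that the same routine can serve p-channel devices, and the example characteristic is synthetic.

```python
def fit_line(xs, ys):
    """Least squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_voltage(vg, i_d, window=5):
    """Extrapolate the steepest window-point fit to zero drain current."""
    if len(vg) != len(i_d) or len(vg) < window:
        raise ValueError("need matching sweeps of at least `window` points")
    best = None
    for k in range(len(vg) - window + 1):
        slope, intercept = fit_line(vg[k:k + window], i_d[k:k + window])
        if best is None or abs(slope) > abs(best[0]):
            best = (slope, intercept)
    slope, intercept = best
    if slope == 0.0:
        raise ValueError("characteristic is flat; no threshold found")
    # Gate voltage at which the fitted line crosses I_D = 0.
    return -intercept / slope


if __name__ == "__main__":
    # Synthetic sweep in 0.03 V steps: zero current below 1.0 V,
    # then a linear rise (values are hypothetical).
    vg = [0.03 * k for k in range(100)]
    i_d = [max(0.0, 2e-5 * (v - 1.0)) for v in vg]
    print(threshold_voltage(vg, i_d))
```

On the synthetic sweep every window that lies wholly in the linear region has the same slope, so the extrapolated intercept recovers the 1.0 V turn-on built into the example.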
Contact resistance of metal to conductive layers was measured from a four-terminal Kelvin-type structure with the current taps separated from the voltage taps. This measurement eliminated the probe-to-probe-pad contact resistance and the series resistance of the metal layers connecting the probe pads to the voltage taps. Further description of this structure can be found elsewhere [11].
C. Testing System Description
The system used to test each of the wafers consists of an Accutest¹ model 3000 parametric test system, two Electroglas model 1034 XA6 automatic probe stations, and two terminals. The test system is configured with current-voltage supplies capable of forcing and measuring, voltage forcing supplies, a picoammeter with a resolution of 0.1 pA, a microvoltmeter with a resolution of 1 μV, and various other testing instruments.
VIII. SIMULATION RESULTS AND ANALYSIS
This section presents results when the method was applied to electrical measurements for critical process parameters for CMOS processes. The following interpretation of the results is made without considering any test chip sampling size cost criteria.
A. Comparing Figure of Merit on Several Wafers
A plot of estimator figure of merit or statistical efficiency versus sample size was generated for each of the electrical parameter samples from the nine wafers. Efficiencies for each of the four estimators were plotted on a single graph to facilitate comparisons among estimator performances.
To obtain some measure of average performance across several wafers, a "normalized figure of merit" $e_{ijk}(n)$ for each parameter distribution k, for estimator i on a given wafer j based on samples of size n, was calculated first, where 1 ≤ j ≤ 9, 1 ≤ i ≤ 4, 1 ≤ n ≤ 20, and 1 ≤ k ≤ 6.
The normalized figure of merit for each wafer was averaged over the nine wafers. The resultant measure of the average performance of estimator i for the kth parameter distribution over the nine wafers was calculated as
$$F_{ik}(n) = \frac{1}{9}\sum_{j=1}^{9} e_{ijk}(n).$$
¹Certain commercial equipment, instruments, or materials are identified in this paper to adequately specify the experimental procedure. Such identification does not imply recommendation or endorsement by the National Bureau of Standards, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.
B. Interpretation of Results
Four normalized average plots summarize the estimators' performances on the following electrical parameters over nine integrated circuit wafers:
1) n-channel MOSFET threshold voltage (V) (Fig. 1).
2) p-channel MOSFET threshold voltage (V) (Fig. 2).
3) p- and n-channel MOSFET transconductance (μA/V²) (Fig. 3).
4) contact resistance (Ω) (Fig. 4).
In these plots, the discrete points M, E, B, T indicate the value of $F_{ik}(n)$ for the mean, median, biweight, and 20-percent trimmed mean, respectively, evaluated at 19 values of n, 2 ≤ n ≤ 20. To aid in the determination of the general trends of the plots, curves are drawn by fitting a sixth degree polynomial via least squares to the points. Fig. 1 shows such a plot for n-channel threshold voltage on a particular wafer designated D-3. The comparatively low figure of merit of the ordinary arithmetic mean is obvious from this plot. The arithmetic mean can be very highly influenced by unusual values of the parameter, which are not representative of the majority of the values on the wafer. This poor efficiency indicates that the underlying distribution of this electrical parameter is probably not Gaussian, for the arithmetic mean is known to be highly efficient on Gaussian data [6]. The simulation shows that the presence of even a few samples containing one or more of these observations causes a highly variable estimator, since the arithmetic means from these samples are far from the true mean. The other three estimators (median, 20-percent trimmed mean, and the biweight) have comparable performances. For 4 ≤ n ≤ 8, the median performs slightly better than the other two; for 9 ≤ n ≤ 20, the biweight surpasses all others.
For p-channel MOSFET threshold voltage (Fig. 2), the trimmed mean attains the maximum figure of merit over all sample sizes, followed very closely by the biweight. Because the p-channel threshold voltage measurements across the nine wafers in this study are fairly stable, the inefficiency of the sample mean is not substantial, but it is definitely less efficient than the other three estimators. Also, the figure of merit of the trimmed mean does not increase substantially beyond n = 12. The estimator performance on p- and n-channel MOSFET transconductance (Fig. 3) is essentially the same. Fig. 4 shows the average figure of merit $F_{ik}(n)$ of the estimators of contact resistance on all nine wafers. In this study this parameter, like n-channel threshold voltage, was more variable than the p-channel threshold voltages in Fig. 2. The sample means give figures of merit much smaller than those given by the other three estimators. The median and the trimmed mean are better than the biweight, and all three are substantially better than the arithmetic mean. Again, only a slight gain in the figure of merit is noticeable beyond samples of size n = 10 to 12.
Finally, the average over all four types of parameters measured on all nine wafers is shown in Fig. 5. This plot reveals that the trimmed mean, the biweight, and the median are all roughly comparable in estimating the aver-
C. Summary
This paper described the procedure for evaluating four statistical estimation techniques of the mean (μ) of an electrical parameter on a wafer. The procedure may be applied in other situations requiring different values of M (total population size), n (sample size), and N (number of replications in the simulation).
In summary, the results of this work are:
a) A method has been presented for relating the test chip sample size to a figure of merit for a given estimation technique.
b) This method can be used in conjunction with cost criteria for test chip sample size to determine the most cost-effective test chip sample size for a given cost structure and process.
c) The method aids the user in making judgments based on numerical simulations concerning the relationships between test chip sample size and figure of merit for each estimation technique suitable for the intended purpose.
In order to demonstrate the technique, this method has been applied to electrical measurements of critical process parameters for CMOS processes. The results are interpreted without considering any test chip sampling cost criteria. These results indicate, subject to the criteria given in Section IV, that:
a) The arithmetic mean can be a highly variable estimator of average tendency of the parameter across the wafer. It provides the lowest figure of merit of the four estimation techniques studied when averaged over all of the electrical parameters.
b) The median, 20-percent trimmed mean, and the biweight have much less variability in their distributions about the true values.
c) The efficiencies of these estimators are judged to become relatively constant at sample sizes greater than 9 to 12. For sample sizes less than 9 to 12, the estimator figure of merit is a rapidly increasing function of sample size, and this dependence must be considered when selecting test chip sample sizes in this range.
d) For individual types of parameters, specific estimators can be determined from Figs. 1-4; as an average for all types of parameters, the 20-percent trimmed mean provides maximum figure of merit and computational ease.
e) Although the results in this study are based on measurements from different CMOS/SOS process lots fabricated by several independent vendors, similar results obtained from a p-well bulk CMOS process suggest that this procedure can be applied to other technologies.
