Abstract-In this brief, we first introduce a process-variation-aware test-point generation method. With this method, faults are not obscured by process variations and we are able to generate new test points by measuring a very limited number of current values on-chip and estimating values of the remaining currents. We furthermore introduce a multiple-fault diagnosis procedure where we use the process-variation-aware test-point generation method. The proposed methods can also be used for structural test. For the application, we have used a thermometer-coded current-steering digital-to-analog converter, as such converters are widely used due to their suitability for high-speed applications and their symmetric design is suitable for the application of our method. We introduce design-for-test hardware for diagnosis cost reduction while implementing our methods. Experimental results show that parametric errors as small as 20% can be diagnosed with up to 97.8% accuracy.
DUE to increased application demands, the resolution requirements are increasing. These increased requirements augment the size of the circuitry, increasing the difficulty and time in detecting the location of defects and testing circuits.
The thermometer-coded circuits, although preferable for high-resolution applications, reduce the controllability of the circuit as each bit increment sums the current of a new current source with the previous ones. Once one of the current sources is faulty, the fault shows at the output current for each following code increment. The diagnosis of a faulty current source and the structural test of the circuit are usually handled by exhaustively incrementing the input code from 0 to $2^n - 1$ for an $n$-bit converter and observing the differences of output current between each increment. This approach requires quadratic diagnosis and test times. As bit numbers increase, a fast way to diagnose and structurally test for faults is necessary. The provision of such a methodology will enable pinpointing defects and alleviate a major burden on further increasing the bit number of a thermometer-coded current-steering (TCCS) digital-to-analog converter (DAC).
Functional (specification) test of a chip does not provide significant insight as to the reasons of failure. The diagnosis of a fault, on the other hand, not only provides the ability to correct for design flaws, but also enables debugging of manufacturing defects. Similarly, structural test provides valuable information regarding the manufacturing type of defects that may not show up in the specification tests. The decrease of structural test and diagnosis time using architecture-specific design-for-test (DFT) techniques is of high importance in testing.
The consideration of process variations in test point selection is important for detectability of small parametric faults. If process variations are not taken into account, some of the parametric faults can be mistaken as a routine process variation. It is necessary to differentiate between a process variation and a parametric fault for increased detection accuracy.
In this brief, we bring a solution to the multiple-fault diagnosis problem by building on top of what we have presented in [1]. We target three problems. The first problem is how to increase detection capability using process-variation-aware test values. The second problem is how to solve the controllability problem of TCCS-DACs and enable structural test. The last problem is how to diagnose multiple parametric faults. We give a motivational example for the process-variation-aware test point generation. Other than the multiple-fault extensions, we have also conducted two more experiments, showing how the detection rate changes as some of the test parameters are perturbed. These experiments will be valuable in deciding on the optimal diagnosis configuration for a circuit. Consideration of parametric faults, increased detection accuracy, and low-cost test and diagnosis techniques are highly important in converter design. The techniques presented in this brief can help refine designs faster and point out manufacturing defects at a low cost in terms of time and hardware, furthermore presenting the possibility of built-in implementation. This low-cost technique can also help build higher resolution DACs, as proper diagnosis and testing are fundamental means towards developing improved circuits.
II. PREVIOUS WORK
Relevant research in the area of current-steering architectures can be summarized as follows. [2] has used a current steering architecture to extract mismatch parameters. [3] has introduced an error estimation and reconfiguration technique for converters, where DNL is used to correct for the integral nonlinearity error. [4] has introduced models for the mismatch in the current sources of a CS-DAC. [5] has analyzed the effects of mismatch on timing errors in a CS-DAC.
In terms of built-in self test, [6] has emphasized the importance of structural tests in converters. [7] has used an oscillation-based testing method. [8] has used a DSP-based test. [9] has used a digital circuit to test an analog-to-digital converter (ADC). [10] has implemented design for testability for pipelined ADCs with correction. They have enabled the detection of faults that could escape due to the correction circuitry. [11] has introduced a testing methodology for current-mode ADCs. [12] has proposed an ADC BIST based on code width and sample difference testing. [13] has proposed a calibration technique using correlation-based successive coefficient measurements. [14] has implemented a calibration technique to measure timing skew by randomly alternating the sample sequence.
For diagnosis, signature analysis has been used in [15]. [17] has used sensitivities for fault diagnosis. [16] has used learning vector quantization, a machine learning algorithm, for diagnosis of analog circuits. [18] has used the differential nonlinearity as a measure to diagnose the faults in a flash ADC. [19] has used time division multiplexing, scan-based testing and voltage controlled oscillator-based measurements for the diagnosis of a pipelined ADC. [20] has used Short-Time Fourier Transform on nonstationary signals to diagnose multiple-stage ADCs. [21] has proposed a diagnosis methodology for sub-ranging ADCs using physical modeling of INL. [22] has introduced a method to choose basis functions as phase-plane error functions for the diagnosis of ADCs. [24] has used neural networks to diagnose converters. Little has been said in particular about the diagnosis of current-steering DAC architectures, or about the diagnosis of multiple faults in the presence of process variations.
III. MOTIVATION

A. Motivational Example for Process-Variation Aware Test
Assume a 3 by 3 matrix of current sources in a converter layout. Each source is designed to output a nominally equal current of 1 unit. Test bounds of [0.85, 1.15] are assigned to each source to cover the inherent process variations, as well as other parametric faults, which can change the output value. A source is tagged as faulty if its output is outside this range. This is the traditional way of assigning test bounds to each source, and it has the following problem. If a parametric fault changes the output of a current source by a sufficiently large additional amount, most faults can still be detected. However, a 5% parametric fault leaves most of the current sources in the valid region; hence, small faults go undetected as they are obscured by process variations.
It would be desirable to estimate the expected process variations using the measured values of a small number of the sources on-chip, and then set the limits for faults on top of these variations. In this brief, we show how this is possible.
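The motivational example can be illustrated numerically. The sketch below is not taken from the brief: the gradient values, the 3% variation-aware window, and the names `process`, `measured`, `fixed_fail`, and `aware_fail` are assumptions chosen only to show how a fixed window hides a 5% fault that a per-source window exposes.

```python
import numpy as np

# Hypothetical 3x3 chip: a smooth process gradient shifts each source by 4-8%.
process = np.array([[1.08, 1.07, 1.06],
                    [1.07, 1.06, 1.05],
                    [1.06, 1.05, 1.04]])

measured = process.copy()
measured[1, 1] *= 1.05            # inject a 5% parametric fault at the centre

# Traditional test: one fixed window around the nominal value of 1 unit.
fixed_fail = (measured < 0.85) | (measured > 1.15)

# Variation-aware test: a tight window (assumed +/-3%) around the expected
# value of each source; here `process` stands in for that estimate.
aware_fail = np.abs(measured / process - 1.0) > 0.03

print(fixed_fail.any())           # the fixed window flags nothing
print(np.argwhere(aware_fail))    # the per-source window flags the centre
```

In practice the expected values would not be known exactly; they would be estimated from a few on-chip measurements.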
B. Motivation for Cost Reduction
The core of a current-steering architecture is its current sources. Current sources are connected to the output branch according to the digital code input to the system. In a binary-coded system, each current source is designed to supply a current that scales in powers of 2. It becomes hard to avoid major errors, as errors in bits with large weights are magnified at the output compared to errors in bits with small weights.
In a thermometer-coded DAC, on the other hand, each current source is designed to supply equal current. It becomes easier to match current sources using layout techniques and hence obtain lower integral nonlinearity (INL), differential nonlinearity (DNL) errors and offset at the output. Furthermore, the thermometer code introduces the possibility of error correction.
On the other hand, the thermometer code brings a controllability problem. When the binary digital input code to the DAC is incremented by one least significant bit (LSB) at each step, a new current source is switched on and connected to the output branch. As a result, once it is switched on, a faulty current source is always present at the output, unlike the case in a binary converter. This means that the faults are cumulative in the output of a thermometer-coded DAC.
We could identify the faulty sources by sequentially incrementing each input digital code and measuring the output. Observation of the difference in output current between two subsequent steps would enable the detection of the faulty current sources. However, this is too costly, as for $n$-bit digital inputs, up to $2^n$ conditions may have to be tried in the worst case. As the resolution of DACs increases with technology, diagnosis and structural test time is becoming intolerable.
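The exhaustive procedure described above can be sketched as follows. This is an illustrative model only: the function name `exhaustive_diagnosis`, the unit nominal current, and the 15% tolerance are assumptions, and the DAC output is idealized as the cumulative sum of the switched-on sources.

```python
import numpy as np

def exhaustive_diagnosis(currents, nominal=1.0, tol=0.15):
    """Step through every code and flag sources whose increment is off-nominal."""
    outputs = np.cumsum(currents)           # DAC output at codes 1, 2, ..., N
    steps = np.diff(outputs, prepend=0.0)   # current added by each increment
    return np.flatnonzero(np.abs(steps - nominal) > tol * nominal)

currents = np.ones(15)                      # 4-bit thermometer DAC, 15 sources
currents[6] = 1.3                           # one faulty source

print(np.cumsum(currents)[5:9])             # the fault persists in all later codes
print(exhaustive_diagnosis(currents))       # index of the faulty source
```

Every code has to be visited once, which is the cost the row and column based procedure aims to avoid.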
Testing is a subset of diagnosis. As long as a set of current sources is determined to exceed the margin, the device can be marked as faulty. On the other hand, in diagnosis, we need to determine which current sources are faulty. Through structural test, some of the functional tests might be skipped, or structural tests can be used in addition to the specification tests for increased information about the device, as manufacturing defects can best be identified by employing a structural test methodology as opposed to functional testing. It would be beneficial to reduce the diagnosis or structural test time by selecting groups of sources to be tested at once. This can be achieved using DFT hardware.
IV. PROCESS-VARIATION AWARE TEST METHODOLOGY
As process variations are always present in a circuit, comparing current sources with nominal values obscures the presence of errors with small deviations. To overcome this problem, expected values for the test results need to be estimated by considering the process variations, and bounds for the parametric faults should be added on top of this estimation on each source. Hence, process variation-aware test points for each individual chip will enable differentiating the parametric faults accurately, improving correct diagnosis.
Process variations are usually provided with a statistical or probabilistic equation for current sources, either obtained from silicon or from simulations. For each particular chip implemented in silicon, each current source will have a particular value sampled out of the corresponding multi-variate probability density function. Identification of a method capable of measuring a very limited number of these current sources and the consequent estimation of the values of the remaining ones enable the incorporation of the process variation information in our test points with a small cost.
It is known that the current sources are correlated with each other due to their pair-wise distances on the chip. Yet statistically, these correlated variables can be individually represented as a sum of independent components through a technique called principal component analysis (PCA). If the currents of the current sources are normalized, i.e., $x_i = (I_i - \mu_i)/\sigma_i$, then we can write $\mathbf{x} = E\,\mathbf{p}$, where $\mathbf{x}$ is a vector consisting of normalized current source variables, $E$ is the eigenvector matrix and vector $\mathbf{p}$ contains the principal components. The normalization step is handled by subtracting the mean and dividing by the standard deviation of a current source.
As a small number of principal components is sufficient to provide most of the variation, we first determine the number of principal components that are necessary to account for the selected minimum variation percentage. Selecting a reduced number, $m$, of these components in $\mathbf{p}$ is equivalent to reducing the number of unknowns by deleting some of the columns of $E$ in the matrix multiplication.
Then, $m$ of these equations (rows) are chosen so that an $m$-equation, $m$-unknown ($m \times m$ row-column) system is obtained. The selection is made from the top of the equation list so that the $x_i$ values can be measured without redundancy by consecutively incrementing the input digital code. While an alternative set of equations would result in values that are likely to be slightly different, the increased error is bounded by the selected minimum variation percentage. Minimization of the associated error requires a selection of equations where the sum of discarded values is minimal, leading to a possible increase in the number of required measurements.
For each row, the $x_i$ values are measured on chip. The $e_{ij}$ values, entries of $E$, are calculated using the correlation matrix. Hence, only the $p_j$ values are left to be determined. This reduced number of equations is then solved to find the $p_j$ values, where $j$ takes values in the range of 1 up to the number of selected principal components. Using the remaining equations and substituting the determined $p_j$ values, all $x_i$ values can be calculated. Hence, it becomes possible to obtain all $x_i$ values within an error bound without measuring the currents exhaustively. Then, using the layout matrix, as we know where each current source is located, we can calculate the test nominal points for each source. These test points will be used for that particular chip during test. The error introduced by PCA should be kept below the test thresholds to be used. This can be achieved by selecting the number of principal components so that the error introduced is around 10 times smaller than the percentage bounds allocated to the parametric faults around the test nominal points.
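The estimation steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4 by 4 layout, the exponential distance-based correlation model, the choice of three retained components, the helper name `estimate_all`, and the assumption that the first three codes switch on three corner sources are all hypothetical.

```python
import numpy as np

rows = cols = 4
# Layout coordinates of the 16 sources and an assumed distance-based correlation.
yy, xx = np.divmod(np.arange(rows * cols), cols)
dist = np.hypot(xx[:, None] - xx[None, :], yy[:, None] - yy[None, :])
corr = np.exp(-dist / 10.0)             # assumed correlation length: 10 pitches

# Eigen-decomposition; eigh returns ascending eigenvalues, so reverse the order.
lam, E = np.linalg.eigh(corr)
lam, E = lam[::-1], E[:, ::-1]

m = 3                                    # assumed number of retained components
captured = lam[:m].sum() / lam.sum()     # fraction of the variation retained

def estimate_all(x_measured, measured_idx):
    """Solve the m-by-m system for p, then predict every normalized current."""
    p = np.linalg.solve(E[measured_idx, :m], x_measured)
    return E[:, :m] @ p

# Assumed switching order: the first three codes turn on three corner sources.
measured_idx = np.array([0, cols - 1, (rows - 1) * cols])

rng = np.random.default_rng(0)
# One simulated chip: components scaled by the square root of their eigenvalue.
p_chip = rng.standard_normal(rows * cols) * np.sqrt(np.clip(lam, 0.0, None))
x_true = E @ p_chip
x_hat = estimate_all(x_true[measured_idx], measured_idx)
max_err = np.max(np.abs(x_hat - x_true)) # error caused by the truncation

print(f"captured variation: {captured:.3f}, max estimation error: {max_err:.3f}")
```

Which sources are measured matters: if the measured positions do not distinguish the retained components, the small system becomes ill-conditioned, which is one reason a different or larger set of equations may be preferable.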
V. LOW-COST MULTIPLE-FAULT DIAGNOSIS METHODOLOGY
Matching of the current sources in a CS-DAC is quite important to keep specifications like INL and DNL within given bounds. In order to provide a good match, a common centroid layout structure is almost always used. In this layout structure, matched components are dispersed in a matrix configuration on layout to average out the effects of process variations.
In thermometer code, $n$ MSB bits would require $2^n - 1$ current sources. MSB and LSB current sources have separate matrices and are matched separately. Considering the MSBs as an example, the matrix size is $2^{n/2}$ by $2^{n/2}$. The arguments above are also applicable to the LSB matrix by replacing the number of MSB bits with the number of LSB bits.
A. Multiple-Fault Diagnosis Formulation
The diagnosis procedure consists of checking the sum of each row and column of the location matrix, instead of exhaustive analysis of each current source. Let $i$ be the enumeration over current sources, where $i = 1, \ldots, N$ and $N$ is the total number of current sources in either the MSB or the LSB matrix. Let $c_j$ and $r_j$ denote the $j$th column and row, respectively. Let $I_i$ denote the current of the $i$th source. Then, we can write $C_j = \sum_{i \in c_j} I_i$ and $R_j = \sum_{i \in r_j} I_i$, where $C_j$ and $R_j$ are the $j$th column and row current sums, respectively. Then, we propose

$$\mathbf{f}_c = \mathrm{find}\big((C_j - \mu_C)^2 > (\alpha\,\sigma_C)^2,\; k\big) \qquad (1)$$

$$\mathbf{f}_r = \mathrm{find}\big((R_j - \mu_R)^2 > (\alpha\,\sigma_R)^2,\; k\big) \qquad (2)$$

where $\mathbf{f}_c$ and $\mathbf{f}_r$ are vectors which hold the column and row indexes of the faulty current sources, and the function $\mathrm{find}$ returns the indexes, $j$, of $k$ sources for which the first argument to the function holds. $k$ is the number of faults to be detected, $\mu_C$ is the mean of all $C_j$'s, $\sigma_C$ is the standard deviation over all $C_j$'s and $\alpha$ is a rate which sets a margin to detect faults. Essentially, the function finds the $k$ largest elements whose squared distance with respect to the mean deviates by more than a given value. A similar step is handled within the row sums. Hence, the process necessitates activation and observation of $2\sqrt{N}$ current values. Letting $M = \sqrt{N}$, the process has a time complexity of $O(M)$, which is linear. Current practices, in contrast, exhibit $O(M^2)$ time complexity.
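A minimal sketch of the row and column sum check is given below. It is illustrative rather than the authors' code: the names `find_faulty` and `diagnose` are hypothetical, the margin is assumed to be the squared product of the rate and the standard deviation of the sums, and the function returns fewer than the requested number of indexes when fewer sums exceed the margin.

```python
import numpy as np

def find_faulty(sums, k, alpha):
    """Indexes of up to k sums whose squared deviation from the mean exceeds the margin."""
    dev2 = (sums - sums.mean()) ** 2
    margin = (alpha * sums.std()) ** 2        # assumed form of the margin
    largest = np.argsort(dev2)[::-1][:k]      # the k largest deviations
    return np.sort(largest[dev2[largest] > margin])

def diagnose(currents, k=1, alpha=0.5):
    """Return (faulty column indexes, faulty row indexes) from the sums alone."""
    col_sums = currents.sum(axis=0)           # one measurement per column
    row_sums = currents.sum(axis=1)           # one measurement per row
    return find_faulty(col_sums, k, alpha), find_faulty(row_sums, k, alpha)

currents = np.ones((8, 8))                    # 64 matched sources
currents[2, 5] = 1.2                          # 20% parametric fault
fault_cols, fault_rows = diagnose(currents, k=1, alpha=0.5)
print(fault_rows, fault_cols)                 # row 2, column 5
```

Only 16 sums are observed for the 64 sources in this example, compared with 64 code steps for the exhaustive approach.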
B. Reduction of Possible Faulty Locations
The special case where the number of faults to be detected, $k$, is 1 corresponds to the single-fault situation. In this case, the column and row index vectors $\mathbf{f}_c$ and $\mathbf{f}_r$ consist of a single element each if a fault is found. In the multiple-fault case, these vectors include $k$ elements each. As $k$ rows and $k$ columns are indicated as faulty, $k^2$ positions could be faulty and have to be checked. To reduce the number of positions to check, we have modified the $\mathrm{find}$ function such that if fewer than $k$ elements exceed the margin, it only returns the ones that exceed it. Furthermore, it is reasonable to expect a full row or column to be faulty. In this case, either $\mathbf{f}_c$ or $\mathbf{f}_r$ has multiple elements and the remaining one has one element in it, corresponding to a faulty row or column, respectively.
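The candidate positions implied by the flagged rows and columns can be enumerated and then checked individually, as in the sketch below. The helper names `candidate_positions` and `confirm_faults`, the 10% tolerance, and the `measure` callback are hypothetical and serve only to illustrate the idea.

```python
from itertools import product

def candidate_positions(fault_rows, fault_cols):
    """All row/column intersections that could hold a faulty source."""
    return list(product(fault_rows, fault_cols))

def confirm_faults(candidates, measure, expected=1.0, tol=0.1):
    """Measure only the candidates and keep those outside the tolerance."""
    return [pos for pos in candidates
            if abs(measure(pos) - expected) > tol * expected]

# Two faults at (1, 6) and (4, 2) flag rows {1, 4} and columns {2, 6}.
actual = {(1, 6): 1.25, (4, 2): 0.75}

def measure(pos):
    """Stand-in for an on-chip measurement of a single source."""
    return actual.get(pos, 1.0)

candidates = candidate_positions([1, 4], [2, 6])
confirmed = confirm_faults(candidates, measure)
print(len(candidates), confirmed)   # 4 candidates, 2 confirmed faults
```

Returning fewer flagged rows or columns directly shrinks this product, which is why trimming the index vectors reduces the number of positions to check.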
C. DFT Circuitry
Hardware that allows concurrent activation of all current sources on each row and column individually should be designed as required by the proposed diagnosis method. We propose such hardware in Fig. 1 for the column decoding circuitry. Similar circuitry is also used for row decoding, except that the column select input is inverted and used as the row select input. In Fig. 1, the original DAC circuitry includes the column thermometer decoder. The rest belongs to the DFT hardware. In the TCCS DAC, the thermometer decoder converts the binary input to thermometer code. If a row switch and a column switch are turned on simultaneously, the corresponding current source is connected to the output.
In test mode, the thermometer decoder is bypassed by setting the test select input to 1. If the column select input is 0, then the binary input selects one of the columns according to the input binary code. The row DFT decoder, with these inputs, selects all rows. Hence, eventually, only one column is selected and the sum of its current sources is connected to the output. The operation of the row and column decoders is complementary; hence, when column select is 1, row select becomes 0 and one of the rows is selected.
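The test-mode behavior of the decoders can be modeled behaviorally as below. This is a software sketch of the described selection logic only, not a description of the circuit in Fig. 1; the function names `dft_select` and `output_current` and the 4 by 4 example are assumptions.

```python
def dft_select(column_select, code, size):
    """Test-mode enables: returns (row_enable, col_enable) as boolean lists."""
    one_hot = [i == code for i in range(size)]
    all_on = [True] * size
    if column_select == 0:
        return all_on, one_hot      # all rows on, one column picked by the code
    return one_hot, all_on          # one row picked by the code, all columns on

def output_current(currents, row_enable, col_enable):
    """A source reaches the output only if both its row and column are enabled."""
    return sum(currents[r][c]
               for r in range(len(row_enable))
               for c in range(len(col_enable))
               if row_enable[r] and col_enable[c])

currents = [[1.0] * 4 for _ in range(4)]
currents[1][2] = 1.2                # faulty source in row 1, column 2

col2_sum = output_current(currents, *dft_select(0, 2, 4))   # sum of column 2
row1_sum = output_current(currents, *dft_select(1, 1, 4))   # sum of row 1
col0_sum = output_current(currents, *dft_select(0, 0, 4))   # a fault-free column
print(col2_sum, row1_sum, col0_sum)
```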
VI. EXPERIMENTAL RESULTS
We have used layout information to extract the correlation matrix. The nominal currents are simulated using the netlist of a CS-DAC design. Variances of current sources are taken as 10%, as determined by worst-case transistor-level simulations of the individual current sources. The remaining calculations are performed using MATLAB R12.
For evaluating diagnosis accuracy, we have generated unit random principal components. Using the correlation matrix and PCA formulation, we have found normalized current source values. This mimics the effects of process variations on a particular chip. Generation of a new set of independent principal components simulates a new chip. We have de-normalized the normalized current source values by multiplying by their standard deviations and adding their mean values. We have generated parametric faults that range from 5% variation up to 50% variation. We call these variations from nominal the error rate. Then, the diagnosis algorithm is run to detect the location of the faulty current sources. We have used principal components that can account for 98% of the whole variation. Only 6 principal components are found to be sufficient to account for 98% of the variation when estimating the remaining current sources.
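The chip-generation step can be sketched as follows. The original experiments were run in MATLAB; this Python sketch is only an illustration, and the names `simulate_chip` and `inject_fault`, the scaling of unit components by the square root of the eigenvalues, and the example correlation matrix are assumptions.

```python
import numpy as np

def simulate_chip(corr, mean, std, rng):
    """Draw one chip: unit components -> normalized currents -> de-normalized."""
    lam, E = np.linalg.eigh(corr)
    z = rng.standard_normal(len(lam))                 # unit principal components
    x = E @ (np.sqrt(np.clip(lam, 0.0, None)) * z)    # correlated, normalized
    return mean + std * x                             # multiply by std, add mean

def inject_fault(currents, index, error_rate):
    """Return a copy with one source deviating by the given error rate."""
    faulty = currents.copy()
    faulty[index] *= 1.0 + error_rate
    return faulty

rng = np.random.default_rng(42)
corr = np.array([[1.0, 0.8, 0.6],
                 [0.8, 1.0, 0.8],
                 [0.6, 0.8, 1.0]])                    # assumed 3-source example
chip = simulate_chip(corr, mean=1.0, std=0.1, rng=rng)
faulty_chip = inject_fault(chip, index=1, error_rate=0.2)
print(chip, faulty_chip)
```

Each call to `simulate_chip` with fresh random components corresponds to a new simulated chip.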
A. Experiment Set 1
Considering the increased size of matching matrices, the diagnosis procedure is repeated on 6-, 8-, and 10-bit matching requirements corresponding to 64, 256, and 1024 matched current sources, respectively. The true detection rate for a single fault is given in Fig. 2 and can be seen to be close to 100%. (The detection rate shows whether a current source diagnosed as faulty is actually the faulty one.) It can be observed that the diagnosis procedure is increasingly accurate as the number of matched current sources increases. This is the result of averaging more current source column or row sums and being able to differentiate this average from the faulty one more effectively. Increasing the fault rate makes the diagnosis easier. It can be observed that even at 20% error rate, we are able to attain 99.8% detection accuracy. This implies that any fault exceeding this error rate is also detected at least at a 99.8% detection rate. Notice that the 2% error left by the principal components is much less than the error rates used.
B. Experiment Set 2
Given a fixed margin rate $\alpha$, we have observed how the detection rate changes as the error rate is decreased from 50% down to 5%. The results are shown in Fig. 3 for 2-4 faults. We can observe that the detection rate decreases as we increase the number of faults or decrease the error rate. Furthermore, as we increase $\alpha$, the detection rate starts to decrease at a higher error rate, indicating that a higher $\alpha$ causes some of the small faults to escape detection.
C. Experiment Set 3
Given a fixed error rate, we have observed how the detection rate changes as the margin rate $\alpha$ is increased from 0.2 up to 3. The results are shown in Fig. 4 for 2-3 faults. We can observe that the detection rate rolls off after a certain $\alpha$ value is exceeded. Increasing the number of faults or decreasing the error rate decreases the roll-off point. We can conclude that an $\alpha$ around 0.5 is a reasonable choice to maximize detection.
VII. CONCLUSION
We have introduced a process-variation-aware test point selection method to reduce the obstruction caused by process variations over parametric faults. We have provided an approach for multiple-fault diagnosis and structural test of thermometer-coded DACs, along with a method to reduce the test and diagnosis time from quadratic to linear. We have used DFT hardware to implement the proposed method. The proposed low-cost technique will enable the development of converters of higher resolution by making it possible to locate and test for defects using the proposed multiple-fault diagnosis and structural test procedures, respectively.
APPENDIX PCA
PCA is a statistical method to write a set of correlated parameters as a linear function of noncorrelated ones. The noncorrelated parameters are lower in number as compared to the correlated parameters.
Although most books would give a detailed analysis [25], we herein provide a practical intuition of PCA. As the function is linear, we can relate this method to linear algebra. Let the correlated variables be in vector $\mathbf{x}$. Let $E$ be the matrix consisting of the eigenvectors of the pair-wise correlation matrix. Let us write the equation $\mathbf{p} = E^{T}\mathbf{x}$. Here, $\mathbf{p}$ is a vector of dimension $N$, the number of correlated variables, and $T$ is the transpose operator.
With this form, $\mathbf{x}$ can be acquired through $E$ and $\mathbf{p}$. The largest eigenvalue of the correlation matrix corresponds to the variable in $\mathbf{p}$ that provides the largest variation in data. Hence, $m$ of the variables in $\mathbf{p}$, which correspond to the largest eigenvalues, are sufficient to account for most of the variation. The ratio of the sum of the selected eigenvalues to the sum of all eigenvalues can be used to ensure that the error in vector $\mathbf{x}$ is minimal. Hence, reducing the variables in $\mathbf{p}$ to a number of $m$, $E$ becomes an $N \times m$ matrix and $\mathbf{p}$ becomes an $m \times 1$ vector. The variables in $\mathbf{p}$ are called the principal components.
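The eigenvalue-ratio criterion can be expressed compactly. The sketch below is illustrative: the function name `components_needed` and the example matrices are assumptions, and 0.98 is used as the default threshold only because it matches the 98% figure quoted in the experiments.

```python
import numpy as np

def components_needed(corr, min_fraction=0.98):
    """Smallest number of leading eigenvalues whose sum reaches min_fraction."""
    lam = np.linalg.eigvalsh(corr)[::-1]          # eigenvalues, largest first
    cumulative = np.cumsum(lam) / lam.sum()       # ratio of selected to total
    count = int(np.searchsorted(cumulative, min_fraction)) + 1
    return min(count, len(lam))

# Two strongly correlated variables: eigenvalues 1.9 and 0.1.
strong = np.array([[1.0, 0.9],
                   [0.9, 1.0]])
print(components_needed(strong, 0.90))    # one component covers 95%
print(components_needed(strong, 0.98))    # both components are required

# Uncorrelated variables cannot be reduced.
print(components_needed(np.eye(4), 0.98))
```

The stronger the correlation between sources, the fewer components are needed, which is what keeps the number of on-chip measurements small.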
