Silicon complexity places long-standing paradigms at risk. Key concerns include increasing process variations, defect rates, infant mortality rates, and susceptibility to internal and external noise. These trends are likely to decrease functional yield. Fabrication of dies with 100% working transistors and interconnections becomes prohibitively expensive. This paper examines the size and the position of the part of the architecture that is a candidate for defect tolerance, for a given topology and defect probability, such that yield can be improved in comparison to error tolerant design. To achieve this goal, we modified the existing mathematical description of yield to include the error tolerant concept by introducing a function Γ(α) that models the topology of the architecture. The evaluation is demonstrated on the bit-plane semi-systolic array, a relatively complex array topology. The method presented for the chosen topology is described and proved in a formal mathematical way, and it easily covers simpler topologies. It is shown that partial involvement of defect tolerant design can significantly improve effective yield for defect rates that are common in nanotechnology.
Introduction
As device size continues to shrink, it approaches the scale of individual atoms and molecules. With atom spacing in a silicon lattice around 0.5 nm, 65 nm drawn features are a little more than 100 atoms wide. Key features, such as gate lengths, are effectively half or a third this size. Continued geometric scaling will take us to the realm where feature sizes are measured in single-digit atoms sometime in the next couple of decades (see [1] ). Key concerns include increasing process variations, defect rates, infant mortality rates, and susceptibility to internal and external noises (see [2] , [3] ). These trends are likely to decrease functional yield. As VLSI scaling continues along its traditional path, we will soon be in a situation where chips will have billions of devices and thousands of defects (see [4, 5] ). Fabrication of die with 100% working transistors and interconnections becomes prohibitively expensive (see [6] ). While scaling approaches the physical limits of devices and fabrication technology, designers will increasingly have to consider qualitative changes.
Fault tolerance (FT) is the ability of a system to continue correct operation of its tasks after hardware or software faults occur (see [1] ). Correct operation typically implies that no errors occur at any system output. Other FT definitions replace the word correct with satisfactory or reliable. Defect tolerance (DT) refers to any circuit implementation that provides higher yield than an implementation that is not defect tolerant, for a given level of defects and process variations. Enhancements in this category include redundancy (often in the form of spares) as well as defect avoidance, in the form of layout and circuit design techniques that reduce circuits' sensitivity to fabrication defects and process variations (see [1, 6, 7] ).
Silicon complexity places long-standing paradigms at risk (see [6]). The concept of building useful computational systems with parts that might be initially defective, experience externally induced transient errors, or eventually develop a permanent lifetime fault is not new. Researchers addressed these problems as far back as the 1940s with the work of von Neumann, Gödel, and Klein, continuing with the emergence of fault tolerant computing in the 1960s and, more recently, the field of defect tolerance (see [8]). In many applications, however, certain types of errors at the system outputs might be acceptable, provided their severities are below given thresholds (see [6, 9, 10]). Such systems are called Error Tolerant (ET) systems. Multimedia applications are one example of ET systems. In multimedia, designers take advantage of the signal processing ability of people to convert the original source signals into lower quality packets of information, since this usually provides acceptable performance to the end user together with reduced bandwidth and hardware costs. An interesting question is: if a signal processing device has a minor hardware defect, will it still produce results that are good enough for the end user? If so, such devices could be sold rather than discarded (see [9]). Relaxing the requirement of 100% correctness for devices and interconnections may dramatically reduce the costs of manufacturing, verification, and testing (see [2]).
Defined by output error thresholds, the most significant part of an ET system can be further designed to be DT, resulting in a Partial Defect Tolerant (PDT) system, in contrast to a Full Fault Tolerant (FFT) system, where DT (or FT) design is applied to the system as a whole (see [10, 11]). As stated in [10], PDT design is preferred in comparison to FFT design. However, the silicon overhead introduced by a PDT implementation may not improve the fabrication cost per die when the defect probability is small: the reduction in the number of rejected defective chips can be so small that the overhead is not justified. The goal of this paper is to examine the size and the position of the defect tolerant part of the array, for a given topology and defect probability, such that yield can be improved in comparison to error tolerant design. To achieve this goal, we modified the mathematical description of yield from [1] to include the error tolerant concept by introducing a function Γ(α) that depends on the array topology. The evaluation is demonstrated on the bit-plane semi-systolic array, a relatively complex array topology. The method presented for the chosen topology is described and proved in a formal mathematical way, and it easily covers simpler topologies. In the paper, the dependency between the yield achievable with a PDT array and the defect probability is mathematically formulated and compared with ET design. The size and position of the candidate array partition for defect tolerance are given as a function of the acceptable error magnitude, and are used in the yield evaluation. It is shown that PDT design can significantly improve effective yield for defect rates that are common in nanotechnology.
The paper is organized as follows: Section 2 gives a brief architectural and design overview of the PDT bit-plane array, Section 3 is devoted to the yield analysis for partial defect tolerant design, Section 4 discusses the error significance of the bit-plane topology, Section 5 evaluates partial defect tolerance on the example of the bit-plane array, and Section 6 gives the concluding remarks.
Partial defect tolerance
To clarify the yield analysis, we give a brief overview of PDT and illustrate the basic concepts using the example of a bit-plane FIR filtering array.
Output words $\{y_i\}$ of an FIR filter are computed as
$$y_i = \sum_{j=0}^{k-1} c_j\, x_{i-j}, \qquad (1)$$
where $c_0, c_1, \ldots, c_{k-1}$ are coefficients and $\{x_i\}$ are input words. Computation (1) can be realized in different manners.
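As a point of reference, the computation in (1) can be sketched in a few lines of Python. The sketch assumes the usual direct-form convolution, in which each output word is the sum of the products of the coefficients with the current and previous input words, and a zero initial state. The function name `fir_filter` and the sample data are illustrative and do not come from the paper.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR: y[i] = sum_j coeffs[j] * samples[i - j].

    Input words before the start of the sequence are taken as zero
    (an assumption made for this sketch).
    """
    outputs = []
    for i in range(len(samples)):
        acc = 0
        for j, c in enumerate(coeffs):
            if i - j >= 0:
                acc += c * samples[i - j]
        outputs.append(acc)
    return outputs


# Three coefficients applied to a short input sequence.
y = fir_filter([1, 2, 3], [1, 0, 0, 4])
print(y)
```

A hardware realization such as a systolic or semi-systolic array computes the same sums, but distributes the multiply-accumulate steps over pipelined cells.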
When high performance is required, systolic arrays are frequently used. Semi-systolic arrays share with systolic arrays not only the desirable simplicity and regularity properties, but also pipelining and multiprocessing schemes of operation. The bit-plane (BP) FIR filter is a semi-systolic architecture with bit-plane operations (see [12, 13]). It provides regular connections with extensive pipelining and high computational throughput (see [12-14]). A functional block diagram of a BP array is shown in Fig. 1. The following notation is adopted: m is the coefficient word length and $k_C$ is the number of coefficients. There are m bit-plane elements that form the array shown in Fig. 1. Each BP (Fig. 1) is formed as a set of $k_C$ rows. A row performs the basic multiply-accumulate operation between the intermediate result from the previous row and the product of the input word and one coefficient bit. Delayed for one clock cycle per row, the output word is available after $k_C \cdot m$ clock cycles (see [12, 13]). Fig. 2 shows the bit-plane array from Fig. 1 redrawn in such a manner that all connections between cells become regular (see [10]). Regular connections simplify the development of error significance maps for the array (see [10]). Regularity is achieved by introducing a set of fictive nodes, represented by circular nodes in Fig. 2. In order to clarify the PDT architecture design, we will define areas of interest for PDT design and illustrate them on the array from 
Definition 1 (Error Tolerance).
The error tolerance is a design concept which allows the architecture to have defects that can produce an error ∆ below the given threshold ψ at the system output
In the previous expression, α is the number of the most significant output bits not prone to errors.
Fabricated dies with errors such that ∆ ≤ ψ can be used rather than discarded. However, for a given ψ, errors with ∆ > ψ cannot be tolerated.
Definition 1 represents the Euclidean distance between the correct and the erroneous output signal, and shows the tolerance of the architecture to errors. However, the Euclidean distance, in the case of the BPA from Fig. 2, can be misleading.
The Euclidean distance is a suitable metric when signals are abstracted as error sources. For example, if the magnitude of the correct output signal is 011···111 and the error within the signal is small, e.g., 000···001, the output result will be 100···000. This appears to be an error in all bits of the output result. When a particular cell from Fig. 2 is observed as an error source, however, the cell has a bounded influence on the output result. The boundaries of influence are defined by the topology of interconnections. Thus, the metric that can abstract the error as a function of position is the Hamming distance, rather than the Euclidean distance, which is defined as
where $y_i^{\mathrm{corr}}$ and $y_i^{\mathrm{err}}$ are the bits with weight $2^i$ from the correct and the erroneous result, respectively.
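The contrast between the two metrics can be checked numerically. The following Python sketch uses an 8-bit word as an arbitrary illustration. It takes the magnitude of the arithmetic difference as the one-dimensional Euclidean distance and counts differing bit positions as the Hamming distance. The function names are hypothetical.

```python
def arithmetic_error(correct, erroneous):
    """Magnitude of the difference between two output words."""
    return abs(correct - erroneous)


def hamming_distance(correct, erroneous):
    """Number of bit positions in which the two output words differ."""
    return bin(correct ^ erroneous).count("1")


correct = 0b01111111

# A small additive error ripples through every bit of the word.
rippled = correct + 0b00000001          # 0b10000000
print(arithmetic_error(correct, rippled), hamming_distance(correct, rippled))

# A single flipped bit of weight 2**6 is large in magnitude but local.
flipped = correct ^ (1 << 6)            # 0b00111111
print(arithmetic_error(correct, flipped), hamming_distance(correct, flipped))
```

In the first case the magnitude of the error is 1 while all 8 bits differ. In the second case the magnitude is 64 while only one bit differs, which is the position-oriented view that the error significance analysis relies on.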
Definition 2 (Minor Defect).
We say that an architecture has minor defects, and that it produces an acceptable output result, if, for a given α, it contains errors for which
For a given architecture the parts of the system that are not prone to errors are defined by the specific application (see [6] ).
Definition 3 (Error Significance Set).
The set of cells, or more generally, the set of subsystems of the architecture, that can induce an error in the output $y_\eta$ is termed the error significance set of the output $y_\eta$ and is denoted $M_\eta$ (see [10]).
As given in Definition 2, α (α ≤ L) is the number of the most significant outputs of the ET system not prone to errors (Fig. 2).
Definition 4 (PDT System).
A partial defect tolerant system is an ET system in which the subsystems that can induce an error ($\Delta_H > 0$) are defect or fault tolerant (see [10]).
Definition 5 (Architecture Partitioning).
The Non-Tolerant Area (NTA) of the architecture is the function $P_{DT}$ from the set {0, 1, ..., L − 1} into the system subsets, so that
$$P_{DT}(\alpha) = \bigcup_{\eta = L-\alpha}^{L-1} M_\eta. \qquad (3)$$
The rest of the system is called Error Tolerant Area (ETA), and is denoted by (P DT ). The shaded area in Fig. 2 shows the NTA of the bit-plane array for α = 2, while the non-shaded area represents the ETA.
In accordance with Definitions 4 and 5, an ET system becomes a PDT system by making the NTA defect tolerant.
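Definitions 3 and 5 can be illustrated on a toy directed graph. The Python sketch below assumes that the NTA for a given α is the union of the error significance sets of the α most significant outputs, and that an error significance set consists of all cells from which the output is reachable. The four-cell graph, the node names, and the function names are hypothetical and are not the bit-plane array of Fig. 2.

```python
def error_significance_set(edges, output):
    """Cells from which `output` is reachable in the directed graph `edges`."""
    reverse = {}
    for src, dsts in edges.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    seen, stack = set(), [output]
    while stack:
        node = stack.pop()
        for pred in reverse.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen


def non_tolerant_area(edges, outputs, alpha):
    """Union of the error significance sets of the alpha most significant outputs.

    `outputs` is ordered from least to most significant bit.
    """
    nta = set()
    for out in outputs[len(outputs) - alpha:]:
        nta |= error_significance_set(edges, out)
    return nta


# Hypothetical toy architecture: four cells feeding three output bits.
edges = {"c0": ["y0"], "c1": ["y1", "c2"], "c2": ["y2"], "c3": ["y2"]}
outputs = ["y0", "y1", "y2"]
cells = set(edges)

for alpha in range(len(outputs) + 1):
    nta = non_tolerant_area(edges, outputs, alpha)
    eta = cells - nta  # the rest of the system is the error tolerant area
    print(alpha, sorted(nta), sorted(eta))
```

Because the sets overlap (here `c1` influences both `y1` and `y2`), the size of the NTA grows more slowly than the sum of the individual set sizes.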
Error tolerance vs. partial defect tolerance
Let C be the fabrication cost of one die, defective or not, and let p be the probability of having a defective subsystem. We call a die a ''usable die'' if it is non-defective or has minor defects, in the sense of Definition 2, which can be tolerated by the application. In order to compare the error tolerant and partial defect tolerant design methods, we define the price per usable die.
Definition 6 (Price Per Usable Die).
For a fabricated die without defects or with a minor defect, in respect to the Definition 2, the price per usable die is a function U(p, α), from the set {(p, α)
where u (u ≥ 1) is the factor that depends on defect probability p as well as the system geometry.
If the probability of having a defect is p = 0, the price of fabricating a usable die is equal to the price of producing one die, i.e., for p = 0 we have u = 1 regardless of geometry. However, if the probability of having a defect is p ≠ 0, the price of fabricating one usable die is greater than C. The factor u from Definition 6 determines the fabrication yield.
Definition 7 (Yield). We define fabrication yield Y of the design as
$$Y = \frac{C}{U(p, \alpha)} = \frac{1}{u}.$$
According to Definition 6, the yield is defined as a comparison parameter scaled into the interval Y ∈ [0, 1], where Y = 1 means that there are no defective dies, and Y = 0 indicates that there is no usable die. A value of Y = 1/2 means that two dies must be produced in order to obtain one usable die.
Let Γ (α) be the probability that the given subsystem belongs to NTA, where α is the number of the most significant outputs required to be error-free. Straightforward calculation gives the following simple lemma.
Lemma 1. If T is the total number of subsystems, and p is the probability of having a defective subsystem, the cost of fabricating one non-defective die is
$$U_{ET}(p, \alpha) = \frac{C}{(1 - p)^{\Gamma(\alpha) \cdot T}}. \qquad (4)$$
Lemma 2. The yield of the ET system is
$$Y_{ET}(p, \alpha) = (1 - p)^{\Gamma(\alpha) \cdot T}. \qquad (5)$$
Proof. From (4) and Definition 7, Eq. (5) is obtained directly.
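The ET yield can be evaluated numerically as a quick plausibility check. The Python sketch below assumes the form implied by Lemma 1 and Definition 7, namely that the yield is (1 − p) raised to the power Γ(α)·T and that the number of dies per usable die is its reciprocal. The function names are illustrative, and the evaluation goes through `math.log1p` only to stay accurate for very small p.

```python
import math


def yield_et(p, gamma, total_cells):
    """ET yield (1 - p) ** (gamma * total_cells), computed in the log domain."""
    return math.exp(gamma * total_cells * math.log1p(-p))


def dies_per_usable_die(yield_value):
    """Number of dies that must be fabricated to obtain one usable die."""
    return 1.0 / yield_value


# Parameters used in the paper's illustration: T = 168 cells, Gamma(alpha) = 0.5.
print(round(dies_per_usable_die(yield_et(0.003, 0.5, 168)), 3))

# Large-chip case discussed in the paper: p = 1e-8, T = 1e9, gamma = 0.5.
print(round(yield_et(1e-8, 0.5, 1e9), 4))
```

With these parameters the sketch gives 1.287 dies per usable die for p = 0.003 and an ET yield of 0.0067 for the large-chip case, which agrees with the values quoted in the text.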
In order to make the NTA tolerant to defects, PDT design requires the replacement of the system's submodules in the NTA with DT modules. Thus, the new system cost C′ differs from the cost of the ET system, the variable C in (4). Furthermore, depending on the chosen DT method, the probability of having a non-defective cell is more complex than the (1 − p) used in (4), and it is denoted as R.
Lemma 3. The cost of a usable die designed using the PDT architecture paradigm is
$$U_{PDT}(p, \alpha) = \frac{C'}{R^{\Gamma(\alpha) \cdot T}}. \qquad (6)$$
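Lemma 4. The yield of the PDT system is
$$Y_{PDT}(p, \alpha) = \frac{C}{C'}\, R^{\Gamma(\alpha) \cdot T}, \qquad (7)$$
where C′ denotes the cost of the PDT system and C the cost of the ET system.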
Proof. Eq. (7) is derived using (6) and Definition 7.
We illustrate the previous concepts using the following example. Let the chosen defect tolerant method in the NTA system area be Spare Components with 2 spares (SC3), see [1, 7] . Then, instead of having one cell in NTA, there are three cells, i.e., the cell and two spares. The overall cost of the system is
while the probability of having a non-defective SC3 DT cell is the probability that the ''original'' cell is defect-free or, if the ''original'' cell is defective, that at least one of the two spares is defect-free (see [6, 7])
Substituting (8) and (9) into (7), we get the yield of the PDT system. The yields of the ET system (5) and the SC3 case of the PDT system (7) are shown in Fig. 3. It is assumed that T = 168, as in Fig. 1, and Γ(α) = 0.5. Fig. 3 shows that both yields decrease as the probability of having a defective cell increases. For the case in Fig. 3, when p = 0 the yield of the PDT system is $Y_{PDT}$ = 0.5. This comes from the additional spare components included in Eq. (8), which double the price per die for the given parameters. At the same time, the yield of the ET system is $Y_{ET}$ = 1, meaning that there is no additional overhead, and for the given probability the price per usable die equals the price of producing one die, as defined in Definition 7. However, there is a probability p above which the PDT system has a higher yield than the ET system. That probability depends on the system's geometry, which shapes the function Γ(α), and on the degree of error tolerance α of the application.
As an illustration, Table 1 gives the costs per usable die for yields shown in Fig. 3 . If the probability of having defective subsystem is 0.003, in the case of ET, 1.287 dies should be produced to obtain one non-defective die. For the same probability, 2.004 dies should be produced in the case of PDT, making the ET design preferred. If the probability p is greater than 0.008, PDT becomes preferred. The shaded area in Fig. 3 shows where PDT is preferred.
Error significance of the bit-plane array
In order to define the intersection of Eqs. (5) and (7), and to calculate the probability starting from which the PDT improves yield for bit-plane array, the function Γ (α) has to be obtained.
Theorem 5. The probability that the given subsystem belongs to the NTA of the bit-plane array is given by
where ℵ(A) represents the cardinal number or cardinality of a set A (see [15, p. 28]).
Proof. From (3), function Γ (α) can be obtained as follows
where T is total number of array cells (
The sets $M_\eta$ in (3) have common elements (Fig. 2). Hence, that fact has to be considered when the cardinality of (3) is calculated from the cardinalities of the individual sets. The function ℵ($P_{DT}$(α)) can be obtained from (3) as follows
where A \ B denotes the set of elements which belong to the set A and do not belong to the set B. In other words, ℵ($P_{DT}$(α)) equals the number of array cells that can induce an error in the most significant output bit $y_{L-1}$, plus the number of array cells with influence on the output bit $y_{L-2}$ that have not already been taken into consideration. The cardinality of the individual sets $M_\eta$ can be obtained from the transitive closure of a directed graph that represents the architecture from Fig. 2. Using results from [10], the transitive closure of the bit-plane array from Fig. 2 is given by
where elements G C = {g
Let dimensions of the bit-plane array be k C = 2, m = 2, l 0 = 4 and L = 6. There are L · (k C + m + 1) = 30 nodes within the array, including output nodes. Hence, the transitive closure is a 30 × 30 matrix (example taken from [10] ). The transitive closure for such a small-size array is given in Fig. 4 . The 24th column (shaded column in Fig. 4 ) corresponds to the error significance set M 5 of output bit y 5 (see [10] ), which implies that the number of elements equal to 1 within the column is equal to the cardinal number ℵ(M 5 ). According to (12) we have ℵ(P DT (1)) = ℵ(M 5 ), which can be obtained from (13) and 
In addition to the array cells included in (15) , there are m · k C cells that should be included when calculating ℵ(P DT (2)), From (15) and (16), for α > 1, we have
Eqs. (15) and (17) give
Finally, substitution of (18) into (3) proves the theorem.
Let us illustrate the previous theorem. From (18), the numerical values of the cardinality ℵ($P_{DT}$(α)) for α = 0, 1, ..., 6, with m = 2, $k_C$ = 2, L = 6 (the example given in Fig. 4), are 0, 14, 18, 21, 23, 24, 24, respectively.
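The listed cardinalities can be turned into values of Γ(α) directly. The short Python sketch below assumes that Γ(α) is the fraction of array cells that belong to the NTA, and that the small example has T = 24 cells (the 30 nodes of the transitive closure minus the 6 output nodes, which also equals the largest listed cardinality). Both assumptions, and the function names, are illustrative.

```python
def nta_fraction(cardinalities, total_cells):
    """Fraction of cells in the NTA for each alpha (assumed form of Gamma)."""
    return [n / total_cells for n in cardinalities]


def newly_added_cells(cardinalities):
    """Cells added to the NTA when alpha grows by one."""
    return [b - a for a, b in zip(cardinalities, cardinalities[1:])]


# Cardinalities of P_DT(alpha) for alpha = 0..6 quoted in the text.
cardinalities = [0, 14, 18, 21, 23, 24, 24]
total_cells = 24  # assumption: 30 nodes minus 6 output nodes

for alpha, gamma in enumerate(nta_fraction(cardinalities, total_cells)):
    print(alpha, round(gamma, 3))
print(newly_added_cells(cardinalities))
```

The increments shrink quickly after the first output bit, reflecting the overlap between the error significance sets of neighbouring outputs.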
Yield of the partial defect tolerant bit-plane array
The probability p starting from which PDT has greater yield than ET, for the bit-plane array, can be obtained as the abscissa of the point of intersection in Fig. 3. Denote γ = Γ(α); then from (5) and (7)-(9):
Theorem 6. Probability p starting from which PDT has greater yield than ET, for the bit-plane array, is
Proof. We have to solve (20) for p. The equation is nonlinear. Therefore, in order to linearize the equation we denote
Let $B = e^A$, then
Solving (23), we get two solutions
Since A ≥ 0, we have B ≥ 1. Hence, the solution with the minus sign is discarded because it gives a negative probability. Let P(α, T) be the function from {(α, T) | α = 0, ..., L, T ∈ N} to the probability p of the point of intersection from
For the sake of illustration, the dependency between the ET application's susceptibility to errors (α) and the probability at which PDT yield overtakes ET yield is obtained from (25) for the example in Fig. 1 (T = 168), and is shown in Fig. 6. The P(α, T) axis is zoomed to emphasize the dependency, and covers the range from p = 0.0076 to p = 0.0082. It can be noticed that variations of α have only a slight influence on the point of intersection of the yields in Fig. 3. For the example from Fig. 1, with T = 168, the PDT has better yield than ET for probabilities of having a defective cell greater than approximately p = 0.008, as shown in Fig. 6.
Fig. 7 shows P(α, T) for different numbers of cells, for constant γ (γ = 0.5). It can be noticed that the probability of having a defective cell at which PDT yield becomes preferred over ET yield decreases exponentially as the total number of cells T increases. From Fig. 7, it can be concluded that the application of defect tolerance on the architecture partition can be well exploited in nanotechnology. In [1, p. 833], it is indicated that ''for today's large chips with T > $10^{9}$ devices, the defect rate p must be below $10^{-10}$ to expect 90 percent or greater chip yield''. Therefore, in the case of BP topology, if the probability p increases only 10 times and reaches $10^{-9}$, or even more, which is common in nanotechnology, the yield of the ET system falls significantly below the yield of the PDT system. In such cases, the PDT becomes the preferred design method. Furthermore, if p = $10^{-8}$ and T = $10^{9}$, the yield of the ET system, according to (5), is $Y_{ET}$ = 0.0067, while the yield of the PDT system, according to (7), is $Y_{PDT}$ = 0.5.
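The crossover probability can also be located numerically, which gives an independent check on the trends described above. The Python sketch below is not the closed form of Theorem 6. It bisects on the difference of the log-yields under two assumptions made only for this illustration: an SC3 cell fails only when all three of its cells are defective, so R = 1 − p³, and the two spares per NTA cell raise the relative cost to C′/C = 1 + 2γ. All function names are hypothetical.

```python
import math


def log_yield_et(p, gamma, total_cells):
    """Logarithm of the ET yield (1 - p) ** (gamma * total_cells)."""
    return gamma * total_cells * math.log1p(-p)


def log_yield_pdt(p, gamma, total_cells):
    """Logarithm of the PDT yield under the assumed SC3 model.

    Assumptions: R = 1 - p**3 and cost ratio C'/C = 1 + 2 * gamma.
    """
    return gamma * total_cells * math.log1p(-p ** 3) - math.log1p(2.0 * gamma)


def crossover_probability(gamma, total_cells, iterations=200):
    """Bisection for the p above which the PDT yield exceeds the ET yield.

    Requires gamma > 0; the search interval is (0, 0.5).
    """
    low, high = 0.0, 0.5
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if log_yield_pdt(mid, gamma, total_cells) > log_yield_et(mid, gamma, total_cells):
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


for total_cells in (168, 1680, 16800, 10 ** 9):
    print(total_cells, crossover_probability(0.5, total_cells))
```

For T = 168 and γ = 0.5 the sketch places the crossover near p = 0.008, in line with Fig. 6, and the crossover shrinks as T grows, so that for T = $10^{9}$ it already lies below p = $10^{-8}$. Working with log-yields avoids floating-point underflow for very large T.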
Concluding remarks
In this paper the analysis of yield in the case of partial application of defect tolerance was presented. The size and the position of defect tolerant part in the array, for the given topology and defect probability, where yield can be improved in comparison to error tolerant design, were examined. We modified mathematical description of yield from [1] by involving error tolerant concept introducing a function Γ (α), which depends on array topology. The evaluation was demonstrated on the bit-plane semi-systolic array, as a relatively complex array topology. The method shown for the chosen topology is described and proved in a formal mathematical way, and it easily covers simpler topologies. It was shown that the partial defect tolerant design can significantly improve effective yield for defect rates common in nanotechnology. 
