Abstract: In this paper, we present an integrated computer-aided design environment, the VAR (VHDL-based Array Reconfiguration) system, for the tasks of design, reconfiguration, simulation, and evaluation in an architecture modeled by VHDL. An easily diagnosable and reconfigurable two-dimensional defect-tolerant PE-switch lattice array is used as an example to illustrate the methodology of VAR. VAR allows designers to study and evaluate fault diagnosis and reconfiguration algorithms by inserting faults, which are generated based on manufacturing yield data, into the array and then locating the faulty PE's as well as simulating the reconfiguration process. Thus, VAR can assist designers in evaluating different combinations of fault patterns, fault diagnosis algorithms, reconfiguration algorithms, and reconfigurable architectures through a complete set of figures of merit aimed at architectural improvements. Extensive simulation and evaluation have been performed to demonstrate and support the effectiveness of VAR. The results from this research can drive the applications of large-area VLSI or WSI (wafer scale integration) closer to reality and result in low-cost, high-yield array architectures.
I. INTRODUCTION
The VHDL-based Array Reconfiguration (VAR) system is an integrated high-level CAD environment for the tasks of design, diagnosis, reconfiguration, simulation, and evaluation on an architecture described in a hardware description language, VHDL. VHDL, the IEEE 1076 Standard, is the basis of the VAR system. It is used to model reconfigurable array architectures in a modular way, and its support environment is used to simulate and evaluate different design alternatives. One of the advantages of developing such a VHDL-based tool is that other VHDL-based tools become available to VAR and vice versa. VHDL has been adopted as a standard hardware description language by the U.S. Department of Defense and is gaining popularity in industry and academia. We can use manufacturing yield data to generate different fault patterns to see whether a simulated array can withstand the possible defects through fault diagnosis and reconfiguration. These results, along with the quantitative evaluation of the target array, can help designers determine appropriate redundancy deployment and allocation. The modeling, simulation, and evaluation processes are illustrated by an example two-dimensional reconfigurable systolic array. The transformation of a faulty array into a target array in a VHDL support environment is used to simulate the actual reconfiguration process. The correctness of the reconfiguration process and the functionality of the target array are verified by performing matrix applications. Experimental results are included to demonstrate the evaluation process. The issues evaluated include redundancy overhead, efficiency of reconfiguration algorithms, fault distribution effect, array quality, yield, and reliability.
Although extensive research in the areas of fault diagnosis and reconfiguration for array architectures [1]-[4] has been performed, integrated architecture-level computer-aided design (CAD) tools that assist the design and evaluation of defect-tolerant array architectures are lacking. Many physical-level VLSI CAD tools, such as tools for layout design automation, are available today, but few address architecture-level system design [5], especially for reconfigurable architectures. The design and analysis of reconfigurable VLSI array architectures are increasingly complicated tasks, especially in evaluating the impact of reconfiguration mechanisms on performance, overhead, and yield. It is necessary to develop such an integrated high-level CAD tool to assist in simulating and evaluating various combinations of fault patterns, fault diagnosis algorithms, reconfiguration algorithms, and reconfigurable architectures. This environment will enable system designers to reduce the design turnaround time, to pinpoint possible design problems in the early design phase, and to optimize their designs through architectural improvements. It will drive the applications of large-area VLSI or WSI (wafer scale integration) closer to reality and help produce low-cost, high-yield array architectures. In [6] the high-level structure of a CAD system for reconfigurable array architectures was outlined, but the authors did not address how the modeling, simulation, and evaluation could actually be performed, and no experimental results were presented. The workbench in [7] aimed at reconfiguration issues only. It did not consider fault diagnosis issues and adopted a concurrent process hardware description language (CPHDL) for reconfigurable architectures. However, it did provide a translator to allow direct translation of the generated architectures into VHDL descriptions. The CAD systems in [8]-[10] are not specifically for defect-tolerant arrays, while the CAD system in [11] is for the conceptual design of VLSI systems. The VAR system presented in this paper can fill this void. Kung [12] presented an array compiler for CAD of array processors which can be divided into three levels: the array level, the processor level, and the realization level. The VAR system focuses on design and analysis issues at the processor level.
0278-0070/92$03.00 © 1992 IEEE
The paper is organized in the following manner. An overview of the VAR system is first presented in Section II. In Section III, a PE-switch lattice model and its implication are first presented. Then VHDL is used to model an example M × N defect-tolerant array. The reconfiguration process and the experimental results are detailed in Section IV. The evaluation process and simulation results are depicted in Section V, followed by some concluding remarks.
II. OVERVIEW OF THE VAR SYSTEM
This section gives an overview of VAR's organization and problem-solving approach. The system configuration of the VAR system is shown in Fig. 1. The module of array design with VHDL will be described in Section III. The functions of the other major modules are specified in the following subsections.
A. Kernel and VHDL Support Environment (K&VSE)
K&VSE supervises, schedules, and coordinates the tasks in VAR. In Fig. 1 a dashed line between K&VSE and a module implies that there is an interaction involving commands and responses between these two modules. The Intermetrics Standard VHDL 1076 Support Environment is the basis of VSE.
B. Fault Diagnosis Module (FDM)
In a reconfigurable system, replaceable faulty elements must first be identified before a repair action can be taken. FDM will execute a fault diagnosis algorithm specific to an architecture. The fault diagnosis algorithms in [13] are used here and the trade-offs can be compared via simulation.
C. Reconfiguration Module (RM)
RM coordinates the following three events: execution of a reconfiguration program, execution of a switching mechanism transformation program, and VHDL simulation of target array generation and functionality verification. The reconfiguration algorithms in [14] are employed. Depending on the switching mechanisms, the algorithms reconfigure an array under three constraints: (1) row and column bypass (algorithm BB), (2) row bypass and column rerouting (algorithm BR), and (3) row and column rerouting (algorithm RR). The efficiency of the algorithms and the complexity of the switching mechanism transformation can be observed through simulation.
D. Evaluation Module (EM)
EM evaluates redundancy overhead, time complexity of fault diagnosis and reconfiguration algorithms, fault coverage, yield, delay, performance, and cost. The parameters that may affect yield are area overhead, diagnosability, reconfigurability, and the efficiency of diagnosis and reconfiguration algorithms. Overhead arising from diagnosis and reconfiguration results in long wire delay and a performance penalty. The correlation of these parameters is studied by EM. Variations in the design using different fault diagnosis algorithms, reconfiguration algorithms, and reconfigurable architectures are also evaluated based on the above parameters.
E. Optimization Module (OM)
Based on the evaluation results, OM modifies an architecture to enhance yield, performance, overhead reduction, and diagnosis and reconfiguration efficiency. The optimized architecture needs to be evaluated by EM again. This procedure may need to be iterated several times before a cost-effective solution is obtained.
F. Synthesis Module (SM)
The ultimate goal of the VAR system is to generate a complete layout of a reconfigurable array directly from its VHDL description. SM can interface with a VHDL-based synthesis tool to translate the array from the behavioral description to a structural description and then from the structural description into a complete layout.
III. MODELING DEFECT-TOLERANT ARRAY USING VHDL
A systolic array will be used as an example to illustrate the modeling process for defect-tolerant arrays using VHDL. Descriptions of the array model, system design, PE design, and switch design and a comparison with an existing modeling method are detailed in the following subsections.
A. A PE-Switch Lattice Model and Its Implication
A general PE-switch lattice model is shown in Fig. 2. Given the array model (M, N, Th, Tv, S, U, V), a non-defect-tolerant array is a U × V array with no redundant elements such as extra PE's, switches, or links; i.e., it is the original architecture for a particular application. A host array is an M × N reconfigurable PE-switch lattice array which may contain faulty elements after manufacturing. A logic array is a host array with the faulty switches and links switched out using the method in [13].
Note that a logic array may still contain faulty PE's. A target array is a U × V reconfigurable PE-switch lattice array that contains no faulty elements and is obtained by replacing faulty elements through reconfiguration. Although a non-defect-tolerant array and the corresponding target array are functionally equivalent, the performance of the latter might be worse than that of the former owing to the [...] + N) for a type-2 array. A characteristic matrix characterizes the status of a host array before reconfiguration and the status of a target array after reconfiguration. It can be decomposed into a plain-switch matrix, a mix-switch matrix, and a PE matrix. The classification used in the characteristic matrix is shown in Fig. 3, where each leaf node is a possible state of a PE or switch. The classification concept is related to the work by Pradhan [16].
[...] and links, while a PE track is a PE-switch block which consists of PE's as well as switches and links. In Fig. 4, only the leaf nodes are modeled with VHDL behavioral descriptions and all higher level nodes are modeled with VHDL structural descriptions. The array is described in a generic way. The host and target array dimensions (M × N and U × V), the numbers of horizontal and vertical link tracks (Th, Tv), the array type (S), the clock cycle times of PE's, switches, and control lines (PE-cycle-time, SW-cycle-time, Ctr-cycle-time), and the delays of subcomponents (Multiplier-delay, Adder-delay, etc.) are described by a VHDL generic statement so that designers can specify these parameters, which facilitates design flexibility.
B. System Design
In order to simulate such a reconfigurable array, a VHDL test bench environment is created to facilitate test and simulation. Fig. 5 demonstrates the VHDL test bench environment for reconfigurable arrays. The VHDL test bench includes two components: the generator (Systolic-gen) and the reactor (Systolic-array). When the VHDL test bench is simulated, Systolic-gen provides stimulus to Systolic-array, and Systolic-array responds to this stimulus and sends results back to Systolic-gen [18]. By using the test bench environment, we can experiment with different design alternatives of Systolic-array and various types of Systolic-gen. For instance, this environment can be used by FDM for the simulation of the fault diagnosis process. In this case, Systolic-gen supplies the test patterns, while Systolic-array responds with the test results. In Section IV, the test bench environment will be used by RM to simulate the generation and the functionality verification of the target array. The VHDL description of the test bench environment is as follows. Fig. 6 shows the VHDL entity declaration with a generic statement of the test bench environment. The generic statement includes the above parameters with default values, which can also be supplied during simulation without reanalyzing the VHDL code. A port with multiple sources, such as a switch port, is defined as a resolved signal and has an associated resolution function to determine the value of the resolved signal. Note that the time-related parameters have the data type Positive or Natural. This is because the VHDL simulator restricts the generic parameters to the basic data types only. Therefore, these parameters need to be converted into the data type Time with the time unit ns during component instantiation. The component Systolic-array is the VHDL description of the example array, which is hierarchical and modular in the manner described by Fig. 4.
Various component design options can be easily checked by experiment. The component Systolic-gen is used to 1) generate the PE, switch, and control clocks; 2) read and transmit the following data to Systolic-array: PE and switch control data and input data for multiplication; and 3) receive the multiplication results from Systolic-array. The above events are synchronized in Systolic-gen by adapting the synchronization mechanism to host arrays and target arrays of various sizes and to different clock cycle times. Note that all the nonbasic data types are defined in the package declaration to facilitate resource sharing and modification. Since this is an architecture-level design system, a single delay time is used to capture all possible inherent delay inside a subcomponent, such as propagation delay and capacitive charging and discharging delay.
C. PE Design
Two types of PE's (PEa and PEb) are designed to illustrate how to incorporate and evaluate different component implementations. The structure of PEa is shown in Fig. 7. [...] It is the clock for a data register when the corresponding PE is bypassed. A data register functions as a delay register when the corresponding PE is being bypassed. The other PE design (PEb) has a configuration similar to that of PEa with the addition of a bypass link to each data path. PEb needs only one PE clock PEclk for each data register. The operation of each register in either PE is triggered by the falling edge of a single-phase clock. The single-phase clock scheme is adopted to focus on system design issues. Although MOS circuits are typically driven by a two-phase clock to avoid the clock skew problem, it will always be possible to directly translate the resulting circuits to a corresponding two-phase MOS implementation [19].
D. Switch Design
The switch structure is shown in Fig. 8 [13]. It has four I/O ports and consists of a switch communication box (SWComm) and a 4-bit switch control register (SWreg). The switch control register has a clock line, a control line, and a control register input (output) line CRin (CRout). When the control line is 0, the switch control registers in the same column of the array form a scan path through CRin and CRout, which can scan in the switch control data. The switch communication box has three types of connection patterns with four possible states for each type. That is, there are 12 possible states of the switch control register. The switch design is more efficient than the design in [20] in terms of the reconfiguration time of switches and design complexity.
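The scan-path idea can be illustrated with a minimal Python sketch. It is not the VHDL model of Fig. 8: the packing of the 12 states into 4 bits (`encode_state`) and the assumption that one whole 4-bit word moves one register further along the column per clock (`scan_in`) are hypothetical choices made only for illustration.

```python
def encode_state(pattern_type, state):
    """Pack a connection-pattern type (0-2) and a state (0-3) into 4 bits.

    Hypothetical encoding: the source only says there are 3 types x 4 states.
    """
    if not (0 <= pattern_type < 3 and 0 <= state < 4):
        raise ValueError("expected type in 0..2 and state in 0..3")
    return pattern_type * 4 + state  # 0..11, fits in a 4-bit register


def scan_in(control_words):
    """Shift control words into a column of chained switch control registers.

    Assumption: each clock moves one 4-bit word one register down the chain.
    """
    registers = [0] * len(control_words)
    # The word meant for the last register in the chain is shifted in first.
    for word in reversed(control_words):
        registers = [word & 0xF] + registers[:-1]
    return registers


column = [encode_state(0, 1), encode_state(2, 3), encode_state(1, 0)]
loaded = scan_in(column)
```

After as many clocks as there are registers in the column, each register holds the word intended for it, which is why the words are fed in reverse order.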
E. Comparison with an Existing Modeling Method
The object-oriented design of reliable/reconfigurable architectures (OODRA) workbench was targeted at the design and analysis of concurrent message-passing-based, parallel reconfigurable architectures [7]. It used a concurrent process model for the description of application-specific reconfigurable parallel architectures. This concurrent process model has been encapsulated in CPHDL. Although VHDL and CPHDL can both describe concurrency and structural reconfiguration, CPHDL was chosen for OODRA instead of VHDL. The authors of [7] argue that VHDL is too verbose to be easily captured in a simulation model targeted at application-specific parallel architectures. However, they eventually still need a translator to convert the CPHDL descriptions into VHDL descriptions, and the translator design might face the same problem. Based on our experience and the fact that VHDL is an IEEE standard, this problem can be overcome. When tool support and the interface to other tools are considered, VHDL is the choice for the description of reconfigurable architectures. Other existing models of hardware execution and their associated hardware description languages are not targeted toward reconfigurable architectures, and are reviewed in [7]. We now compare the use of CPHDL in OODRA with the use of VHDL in VAR to describe reconfigurable architectures. There are three basic elements in a CPHDL-based reconfigurable architecture description: process abstractions, switch abstractions, and link abstractions [7]. A reconfigurable architecture is composed of multiple processors implementing processes interconnected by links and reconfigured by switch mechanisms [7]. The VHDL description of a reconfigurable architecture is based on the hierarchical structure in Fig. 4. One extra level of hierarchy, which consists of the switch block and the PE-switch block, is added, so we deal with only one dimension of interconnections instead of two dimensions of interconnections, as in the CPHDL-based architectural description. This has the advantage of making the architectural description concise and clear. In OODRA, both the host array and the non-defect-tolerant array need to be described by CPHDL. The description of the non-defect-tolerant array is used as a template in order to map the host array with multiple faults into a working system (target array) [7]. The link and channel structures are static; i.e., once instantiated, they cannot be physically rerouted. In VAR, only the host array is described in VHDL. The target array is obtained through actual reconfiguration by using the PE and switch control data to convert the host array into the target array. The interconnection patterns are dynamic in our approach; i.e., they can be rerouted by using different PE and switch control data. Therefore, by describing the host array in VHDL, we can reconstruct and simulate various target arrays based on different fault patterns. The above discussion outlines the basic differences between the modeling methods of CPHDL in OODRA and VHDL in VAR. This comparison demonstrates the effectiveness and flexibility of the VHDL modeling method in VAR.
IV. RECONFIGURATION PROCESS AND SIMULATION
Only host arrays with PE faults are considered in the simulation and evaluation processes, as is usually assumed in the literature [3], [4]. The effect of faulty switches and links was addressed in [13], where the faulty switches and links can be identified. During reconfiguration, each faulty switch or link is lumped into one of the adjacent PE's and that PE is considered faulty. One study [21] showed that the following yield statistics are typical: 30-65% for PE's, 99% for switches, and 95% for links (or wires). Since the yields of switches and links are high, it is appropriate to use host arrays with faulty PE's only to illustrate the simulation and evaluation processes.
A. Reconfiguration Process
The reconfiguration process in Fig. 9 includes three stages: 1) execution of the reconfiguration program, 2) execution of the switching mechanism transformation program, and 3) generation of the VHDL target array. The reconfiguration algorithms described in [14] are used to generate a logic matrix based on the error matrix for a host array. The transformation program takes the logic matrix as the input and transforms it into a characteristic matrix according to the selected switching mechanism. The characteristic matrix, which contains switch settings and PE bypassing information, can then be used as the control input to reconfigure the VHDL-based host array into the target array. The correctness of the reconfiguration process is verified through VHDL simulation. The transformation program consists of two parts: the transformation algorithm and the text-to-binary conversion program. The transformation algorithm generates the characteristic matrix of a target array based on an error matrix and a logic matrix. The three output matrices, the plain-switch matrix, the mix-switch matrix, and the PE matrix, are derived from the characteristic matrix. These output matrices contain the switch settings for the switches in the vertical link tracks and the switches in the vertical PE tracks, together with the PE bypassing information. These matrices need to be converted into binary before they can be used as the input for VHDL simulation to reconfigure the VHDL-based host array.
B. Simulation Setup
The reconfiguration program and the transformation program are implemented in C on a Sun 3/260 workstation. The conversion of an example 6 × 6 type-2 host array into a 5 × 4 target array is used to illustrate the reconfiguration process. Fig. 10 shows the graphical representation of the reconfiguration process for the example array, where only active links are retained for clarity. After the first two stages of the reconfiguration process are executed, the Intermetrics Standard VHDL 1076 Support Environment is used to simulate the third stage of the reconfiguration process and matrix multiplications. To demonstrate how to validate the reconfiguration process and the functionality of the target array, an example matrix multiplication is performed. The default values of the generic parameters (as shown in Fig. 6) are used unless stated otherwise. The VHDL simulation process includes the third stage of the reconfiguration process and a matrix multiplication, and is divided into five parts: 1) transmitting the plain-switch matrix and the mix-switch matrix from the north side, and the PE matrix (which initially contains the control data to bypass all the PE's) from the west side of the array to begin reconfiguration; 2) transmitting the multiplicand matrix [W] from the west side of the array via the X input data paths to the array and storing each element of the matrix in the W data register of the corresponding PE; 3) transmitting the PE matrix to reconfigure the host array into the target array; 4) transmitting the zeros from the west side of the array via the X input data paths and the multiplier matrix [Y] from the north side of the array via the Y input data paths to the array, and starting multiplication; and 5) collecting the multiplication product [X] from the east side of the array via the X output data paths. An input format conversion program is necessary to convert the multiplicand and multiplier matrices into the appropriate formats, and an output format conversion program is also needed to convert the binary output data into integers arranged in the form of a matrix.
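The output conversion and functional check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the simulator output is taken to be a row-major list of unsigned binary strings, the product is taken to be [X] = [W][Y], and the helper names `bits_to_matrix` and `reference_product` are hypothetical rather than part of VAR.

```python
def bits_to_matrix(bit_words, rows, cols):
    """Convert unsigned binary strings (row-major order) into an integer matrix."""
    values = [int(word, 2) for word in bit_words]
    if len(values) != rows * cols:
        raise ValueError("number of words does not match the matrix shape")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def reference_product(w, y):
    """Plain software matrix product used as the expected result."""
    inner = len(y)
    return [[sum(w[i][k] * y[k][j] for k in range(inner))
             for j in range(len(y[0]))]
            for i in range(len(w))]


W = [[1, 2], [3, 4]]          # multiplicand matrix
Y = [[5, 6], [7, 8]]          # multiplier matrix
simulated_output = ["10011", "10110", "101011", "110010"]  # made-up sample data
X = bits_to_matrix(simulated_output, rows=2, cols=2)
target_array_ok = (X == reference_product(W, Y))
```

A mismatch between the converted output and the reference product would point to an error either in the reconfiguration data or in the array model.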
C. Experimental Results
Both the PEa-based array and the PEb-based array are simulated. It is found that the PEa-based target array needs to operate at a PE-cycle-time of 400 ns, while the PEb-based target array can operate at a PE-cycle-time of 200 ns. This is because the delay caused by the bypass registers in the PEa-based target array is larger than that of the bypass links in the PEb-based target array. One way to improve the PE-cycle-time of PEa is to adopt only one PE clock and add some delay registers on certain data paths, as suggested in [22]. The simulation results compiled from the output report of the VHDL report generator for the PEb-based array are shown in Table I [...] [12]. Notice that the performance measure for the PEb-based array is done at the processor level, while for the Warp processor array and the Hadamard transform chip it is done at the realization level. To measure the efficiency of the VAR environment, the run times of the first two stages of the reconfiguration process and the VHDL simulation process for the PEa-based and PEb-based arrays are shown in Table II. Note that the third stage of the reconfiguration process is part of the VHDL simulation process. It is observed that the total run time is dominated by the reconfiguration time and the transformation time. The reconfiguration time and the transformation time for larger arrays are further discussed in the next section. The reason why we have a fast VHDL simulation process has to do with the simple designs of PE and switch control lines, which reduce the actual reconfiguration [...]
To evaluate the efficiency of redundancy and reconfiguration algorithms as well as the quality of target arrays, processor arrays of various sizes are simulated and evaluated using these figures of merit. Simulation results for both random and clustered faults are discussed in subsections B and C, respectively.
A. Figures of Merit
The evaluation criteria for a reconfigurable architecture are based on the following figures of merit: survival probability, locality (LA), maximal interconnection length (MIL), [...].
[...] The larger LA is, the longer the average wire length of an interconnection link will be. It also means more switches along an interconnection link. As a result, a larger LA implies a higher failure rate of an interconnection [...] determined by MIL, LA can tell us the quality of a target array.
The maximal interconnection length (MIL) of a target array (M, N, Th, Tv, S, U, V) is defined as follows:
$$\mathrm{MIL} = \max\left(IL_{i,j,k,l}\right) \tag{4}$$
where $(r_i, c_j)$ and $(r_k, c_l)$ are the indices of the augmented logic matrix for any two adjacent PE (i, j) and PE (k, l).
[...] (based on the degradation approach) (6)
or
[...] (based on the redundancy approach) (7)
where P is the manufacturing cost ratio of a PE to a switch, [...] is the number of switches, and [...] is the number of extra PE's, with R (C) being the number of extra PE rows (columns). Equations (6) and (7) are equivalent if UV = MN - P. To attain the value of Oh, the number of switches used in the array should be determined first. The number of switches in a type-1 reconfigurable array architecture is [...] and that in a type-2 reconfigurable array architecture is [...].
The area overhead of a PE-switch lattice array is derived as follows. Assuming the width of a switch to be wλ, where λ is the length unit, the area (A0) of a non-defect-tolerant array (U, V, 0, 0, S, U, V) is equal to [...], where δ is the width ratio of a PE to a switch. The area of a PE-switch lattice array (M, N, Th, Tv, S, U, V) is [...] (based on the type-1 array), or [...] (based on the type-2 array). Therefore, the area overhead is equal to [...].
[...] where SPE = MN - UV is the number of extra PE's, α is the clustering parameter, D is the defect density, and the survival probability for k faulty PE's is the probability that the reconfiguration algorithm can reconfigure a faulty host array into the desired target array given k faulty PE's. Its value, Sk/T, can be attained by Monte Carlo simulation, where Sk is the number of fault patterns with k faulty PE's that can be reconfigured successfully and T is the total number of fault patterns generated. The manufacturing yield (Y0) of a non-defect-tolerant array (U, V, 0, 0, S, U, V), obtained by setting [...] = 0 in Equation (14), is [...] (15).
Assume that the reliability (RPE(t)) of a PE is exponentially distributed with a failure rate λPE, i.e., $R_{PE}(t) = e^{-\lambda_{PE} t}$. The reliability (RA(t)) of a PE-switch lattice array (M, N, Th, Tv, S, U, V) is [...] and the reliability (R0(t)) of a non-defect-tolerant array (U, V, 0, 0, S, U, V) is [...].
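The Monte Carlo estimate of the survival probability for k faulty PE's, i.e., the fraction of generated fault patterns that can be reconfigured successfully, can be sketched as follows. This is a minimal Python illustration, not the authors' C program: the uniform placement of faults and the toy success test `row_bypass_succeeds` (bypass every row that contains a fault) are assumptions standing in for the reconfiguration algorithms of [14].

```python
import random


def row_bypass_succeeds(faults, spare_rows):
    """Toy criterion (assumption): succeed if the faulty rows fit in the spares."""
    return len({row for row, _ in faults}) <= spare_rows


def estimate_survival(m, n, k, trials, can_reconfigure, seed=0):
    """Estimate S_k / T for an m x n host array with exactly k faulty PE's."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(m) for c in range(n)]
    successes = 0                                # S_k
    for _ in range(trials):                      # T fault patterns
        faults = rng.sample(cells, k)            # k distinct faulty PE's
        if can_reconfigure(faults):
            successes += 1
    return successes / trials


# Example: a 22 x 22 host array with two spare rows and three faulty PE's.
estimate = estimate_survival(
    22, 22, k=3, trials=2000,
    can_reconfigure=lambda faults: row_bypass_succeeds(faults, spare_rows=2))
```

Passing the success test as a function keeps the estimator independent of the switching mechanism, so the same loop could be reused with a different reconfiguration criterion.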
B. Simulation with Random Fault Distribution
Three host arrays, (27, 27, 1, 1, 1, 25, 25), (22, 22, 1, 1, 1, 20, 20), and (17, 17, 1, 1, 1, 15, 15), are simulated by using the reconfiguration algorithm RR. Parts (a) and (b) of Fig. 11 show the relationship of reconfiguration time and transformation time with respect to fault sizes. The reconfiguration time is less than 0.2 s in most cases. The time needed by transformation is also small (≤0.2 s) and independent of fault sizes. The utilization is an important index for a reconfiguration algorithm based on the degradation approach. Fig. 11(c) shows the relationship between utilization and fault sizes. The utilization for these arrays is at least 80% with fewer than 30 faults. The utilization decreases if the fault size increases. Higher utilization usually results in higher survival probability. To study the effect of redundancy on the survival probability, the following host arrays are simulated by using algorithm RR: (21, 20, 1, 1, 1, 20, 20), (21, 21, 1, 1, 1, 20, 20), (22, 21, 1, 1, 1, 20, 20), and (22, 22, 1, 1, 1, 20, 20). That is, the number of extra rows/columns is in the range from 1 to 4. As expected, Fig. 11(d) shows that the larger the redundancy, the higher the survival probability. This figure can help designers determine the appropriate redundancy based on the required survival probability and the possible fault sizes derived from manufacturing yield data. To determine if the amount of redundancy is appropriate, several parameters need to be evaluated. [...] should be evenly distributed on rows and columns if the target array is a square array. Table III also shows the effect of δ on yield. The yield decreases if δ increases; that is, arrays with larger PE's result in a larger wafer area and thus tend to have lower yield.
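As a quick worked check of the redundancy levels in the four host arrays used for the survival probability study, the number of extra PE's, MN - UV, can be computed directly from the array tuples. The short Python sketch below only does this arithmetic; the function name `extra_pes` is illustrative.

```python
def extra_pes(m, n, u, v):
    """Number of extra PE's in an M x N host array with a U x V target array."""
    return m * n - u * v


# (M, N, U, V) for the four host arrays simulated with algorithm RR.
hosts = [(21, 20, 20, 20), (21, 21, 20, 20), (22, 21, 20, 20), (22, 22, 20, 20)]

for m, n, u, v in hosts:
    spare_lines = (m - u) + (n - v)   # extra rows plus extra columns
    print(f"{m} x {n}: {spare_lines} extra rows/columns, "
          f"{extra_pes(m, n, u, v)} extra PE's")
```

The four arrays carry 20, 41, 62, and 84 extra PE's for a 400-PE target array, so each additional row or column adds roughly 5% more PE's.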
C. Simulation with Clustered Fault Distribution
Manufacturing defect clustering occurs on a wafer [27], [28]. Therefore, in addition to evaluating the effect of random faults, we also study the effect of clustered faults on reconfigurable processor arrays. We apply the method in [...]. The generation of clustered faults is controlled by two parameters, α1 and α2, where α1 is the probability of a PE being faulty at the initial fault generation stage and α2 is the clustering parameter. The clustered fault generator first generates faults with the probability α1 for each PE. Based on this fault pattern, the generator converts a nonfaulty PE (i, j) to a faulty PE according to the probability α1 + α2 · adjacent(i, j), where adjacent(i, j) is the number of faulty PE's that are adjacent to PE (i, j). Wraparound is assumed at the array boundaries to determine the value of adjacent(i, j). Therefore, a boundary PE at one side of the array is assumed to be adjacent to the boundary PE at the other side [30]. Compared with Table III, Table IV shows the figures of merit for redundancy evaluation under the clustered fault distribution (α1 = 0.001) using algorithm RR. Tables III and IV show that the higher the redundancy, the larger the survival probability, LA, and MIL. We can see that the arrays under the clustered fault distribution with α2 = 0.01 have lower yield than the arrays under the random fault distribution. However, increasing the degree of clustering with α2 = 0.001 (i.e., α2 decreasing) tends to increase the survival probability and thus enhance array yield. Table IV shows that the survival probability increases if α2 decreases; that is, after increasing the degree of clustering to a point, arrays with a clustered fault distribution will have higher yield than arrays with a random fault distribution. This is why the clustered faults have to be considered. Therefore, inclusion of clustering in redundancy yield calculation is of considerable importance [28].
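The two-stage clustered fault generator described in this subsection can be sketched in Python as follows. The sketch rests on assumptions that the text does not settle: the conversion probability is read as the first parameter plus the second parameter times the adjacent fault count (written `alpha1 + alpha2 * adjacent`), adjacency is taken to mean the four nearest neighbours, and the second stage looks only at the initial fault pattern. The function names are hypothetical.

```python
import random


def adjacent_faults(pattern, i, j):
    """Count faulty 4-neighbours of PE (i, j), wrapping around at the boundaries."""
    m, n = len(pattern), len(pattern[0])
    neighbours = [((i - 1) % m, j), ((i + 1) % m, j),
                  (i, (j - 1) % n), (i, (j + 1) % n)]
    return sum(pattern[r][c] for r, c in neighbours)


def clustered_faults(m, n, alpha1, alpha2, seed=None):
    """Return an m x n boolean fault map (True means the PE is faulty)."""
    rng = random.Random(seed)
    # Stage 1: every PE is faulty independently with probability alpha1.
    initial = [[rng.random() < alpha1 for _ in range(n)] for _ in range(m)]
    # Stage 2: a nonfaulty PE turns faulty with probability
    # alpha1 + alpha2 * (number of adjacent faults in the initial pattern).
    final = [row[:] for row in initial]
    for i in range(m):
        for j in range(n):
            if not initial[i][j]:
                p = alpha1 + alpha2 * adjacent_faults(initial, i, j)
                final[i][j] = rng.random() < p
    return final


fault_map = clustered_faults(22, 22, alpha1=0.001, alpha2=0.01, seed=1)
fault_count = sum(sum(row) for row in fault_map)
```

Such a fault map could be fed to a reconfiguration routine in place of a uniformly random pattern to compare the two fault distributions.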
To illustrate the effect of the numbers of link tracks and of clustered as well as random faults on array yield by using the reconfiguration algorithms RR, BR, and BB, respectively, the host arrays (22, 22, 1, 1, 1, 20, 20), (22, 22, 1, 0, 1, 20, 20), and (22, 22, 0, 0, 1, 20, 20) are simulated. Table V shows the simulation results, where α1 is set to 0.001. A more complex switching mechanism results in larger survival probability, LA, and MIL, as demonstrated in Table V. Under the clustered fault distribution, if the degree of clustering is increased, the arrays for algorithms RR and BR have higher yield, but the arrays for algorithm BB have lower yield. Note that for the arrays using algorithm BB, the survival probability and LA stay the same even if α2 decreases. So algorithms RR and BR perform better for arrays with clustered faults. Algorithm BB has smaller overall yield because it does not have link tracks to provide row or column rerouting capability. One surprising result is that the arrays for algorithm BR have a yield that is comparable to that of the arrays for algorithm RR, although algorithm RR is more flexible. Arrays for algorithm RR have slightly higher survival probability, but this is offset by the larger area owing to more switches than arrays for algorithm BR. This shows that it is not necessarily better to have more switches, since the extra switches would increase the area without increasing the size of the target array. Therefore, for certain applications, one may want to choose the redundancy strategy in algorithm BR, which has smaller LA, MIL, Oh, and area overhead. Fig. 12 illustrates the relationship of reliability with respect to time for various reconfigurable arrays as well as the non-defect-tolerant array under the clustered fault distribution (α1 = α2 = 0.001). The PE failure rate is assumed to be λPE = 0.1 failures per unit time. Note that the array (22, 21, 1, 1, 1, 20, 20) has only a slightly higher reliability than the array (21, 21, 1, 1, 1, 20, 20), although the former has an extra row. However, without redundancy the reliability of the non-defect-tolerant array will fall off rapidly.
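To give a feel for how quickly an array without redundancy loses reliability, the exponential PE model with a failure rate of 0.1 per unit time can be evaluated numerically. The Python sketch below is illustrative only: the series formula (every one of the U × V PE's must work, failures independent, switches and links ignored) is an assumption made here and is not taken from the paper's reliability equations.

```python
import math


def pe_reliability(t, failure_rate=0.1):
    """Exponential reliability of a single PE: exp(-failure_rate * t)."""
    return math.exp(-failure_rate * t)


def series_array_reliability(t, num_pes, failure_rate=0.1):
    """Assumed model: the array works only if all num_pes PE's work."""
    return pe_reliability(t, failure_rate) ** num_pes


for t in (0.0, 0.01, 0.05, 0.1):
    print(f"t = {t:4.2f}: single PE {pe_reliability(t):.4f}, "
          f"20 x 20 array {series_array_reliability(t, 400):.4f}")
```

Under this assumed model a single PE is still about 99% reliable at t = 0.1, while a 400-PE array with no spares is below 2%.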
The above simulation results demonstrate that by evaluating the figures of merit through simulation, we can choose a better combination of the redundancy scheme, switching mechanism, and diagnosis and reconfiguration algorithms to design a reconfigurable array based on actual manufacturing yield data.
VI. CONCLUSIONS
The integrated high-level CAD environment VAR for the design, diagnosis, reconfiguration, simulation, and evaluation of defect-tolerant VLSI or WSI array architectures has been presented. We have concentrated on the modeling, simulation, and evaluation processes for a defect-tolerant two-dimensional array in the VAR system. The simulation of the reconfiguration process in VAR is implemented by interfacing the reconfiguration program, the transformation program, and the VHDL description of the array. Extensive simulation has been performed and experimental results are obtained which indeed demonstrate the effectiveness of our approach. The VAR system will greatly help designers to evaluate different redundancy strategies, various fault diagnosis and reconfiguration algorithms, quality of target arrays, yield, and reliability. Future research issues include 1) simulation and evaluation of the fault diagnosis process; 2) optimization in terms of architectural designs, fault diagnosis techniques, and reconfiguration techniques; 3) interface to a VHDL synthesis system for the layout generation; and 4) extension to other reconfigurable parallel architectures.
