Abstract-This paper presents a robust VLSI architecture that avoids most malfunctions and keeps the system working correctly. The proposed architecture achieves robustness using only small switches. The switches bypass broken computing modules and reconfigure the data flows between the remaining normal modules. Compared with conventional duplicated systems, this architecture has advantages in resource utilization and circuit area, and it improves the yield rate. We designed a Viterbi decoder based on the proposed robust architecture and evaluated its effectiveness in CMOS technology.
I. INTRODUCTION
Recently, high-speed processing for complex systems has been in demand, and parallel/concurrent processing is one good design approach for such processing in VLSI design. Progress in semiconductor process technologies makes this possible by integrating a huge number of transistors on a chip. This means that almost all systems can be designed with parallel/pipeline architectures using a larger number of gates. However, current LSI manufacturing struggles with various difficulties in silicon wafer manufacturing in order to maintain the yield rate. Problems such as electromigration, the antenna effect, IR drop, and crosstalk noise prevent circuits from working correctly, especially in DSM (Deep Sub-Micron) devices. As the number of transistors per chip increases, the loss caused by the breakdown of chips becomes more serious. However, the broken parts mostly remain a small percentage of the whole circuit. If these broken parts can be detected and isolated from the normal parts, many chips can recover most of their operating functions. Accordingly, this concept of robustness helps to increase the yield rate and should be useful for future SOC (System-On-Chip) design.
If we consider robustness against such malfunctions, a fault-tolerant system [1] with duplicated systems can generally be applied. However, it has drawbacks in terms of resource utilization and power consumption because the redundant parts are never utilized when there are no malfunctions. The proposed architecture requires no redundant modules. It can be realized by improving the switches used in dynamically reconfigurable architectures [2], [3]. These switches can isolate malfunctioning modules from normal modules and reconfigure the data flows between the normal modules. The key point is that these switches are embedded into processing blocks with small granularity.
To verify the validity of the proposed architecture, we applied it to a Viterbi decoder, which is a good example of a robust system. The Viterbi decoder [4]-[6] is widely used in wireless communication systems such as wireless LAN, cellular phones, and terrestrial digital broadcasting.
II. CONCEPT OF THE ROBUST ARCHITECTURE
A robust architecture makes it possible for an LSI that includes some malfunctions to work correctly. The difference between the proposed system and fault-tolerant systems is that the proposed architecture has no duplicated modules. In a fault-tolerant system, the duplicated modules are used as standby modules. Under normal conditions, i.e., with no malfunctions, all redundant modules simply wait until they are needed, which means that a large resource redundancy exists. In contrast, when the proposed robust system has no malfunctions, all modules are effectively employed. When the proposed robust system has some malfunctions, the embedded switches control the data flows to avoid all faulty modules, and the system maintains its operating performance by adjusting clock speeds. Figure 1 shows an example of the robust architecture: the left figure shows a basic parallel processor and the right figure shows a robust parallel processor. Figure 2 shows an example of the behavior of the basic parallel processor and the robust parallel processor when they have malfunctions in the same parts. In the left figure, the system is totally disabled since all parallel lines contain malfunctions. In contrast, the proposed system can still work since some switches change the data flows appropriately so that they skip the faulty modules. In this paper, a data path newly opened by the switches in a system having malfunctions is called an "active parallel line" and is indicated by the arrows.
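The contrast described for Figure 2 can be illustrated with a minimal sketch. It rests on an assumption that is not stated in the text: that ideal, fault-free switches between pipeline stages can chain any working module of one stage to any working module of the next, so that the stage with the most faults limits the number of active parallel lines. The function names and the fault-matrix layout are hypothetical.

```python
from typing import List


def active_lines_basic(faults: List[List[bool]]) -> int:
    """Count usable lines in a basic parallel processor.

    faults[line][stage] is True when that module is broken.
    Without switches, a line is usable only if every module in it works.
    """
    return sum(1 for line in faults if not any(line))


def active_lines_robust(faults: List[List[bool]]) -> int:
    """Count active parallel lines under the idealized-switch assumption.

    Assumption (not from the paper): working modules of adjacent stages
    can be connected freely, so the stage with the most faults sets the limit.
    """
    num_lines = len(faults)
    num_stages = len(faults[0])
    worst_stage_faults = max(
        sum(1 for line in range(num_lines) if faults[line][stage])
        for stage in range(num_stages)
    )
    return num_lines - worst_stage_faults


# Three lines, three stages, one fault in every line at a different stage.
example = [
    [True, False, False],
    [False, True, False],
    [False, False, True],
]
print(active_lines_basic(example), active_lines_robust(example))
```

In this example every line of the basic processor contains a fault, so no line is usable, while the idealized robust processor still keeps two active parallel lines.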
III. SPECIFICATION AND ARCHITECTURE OF THE SWITCH
The switch is the most important module of the robust architecture. The target performance of the robust architecture is given by
where N_p is the number of parallel lines in the system, N_s is the number of pipeline stages, and E(x) is the number of faulty modules at pipeline level x. N is the number of active parallel lines in the system when there are some malfunctions. Here, it is assumed that the switches have no malfunctions. To achieve the condition of Eq. (1), the specification of the switch is determined as follows. The number of input and output ports is 2N_p − 1. Each port of the switch is named as shown in Fig. 3. Here, n and N_p satisfy the following equation:
The data flow is determined by the "State Signal". The data given to In K is output from Out (n + K + 1). The data given to In (n + K) is output from Out (K + 1). Here, K ranges from 1 to n. According to the "State Signal", the source of Out 0 is selected from In 0 to In 2n.
An example of the assignment of the "State Signal" and the selected source of Out 0 is shown in Table I. Here, n is assumed to be 6. When Out 0 is driven by a left-side input port selected from In 1 to In n, the LSB (Least Significant Bit) of the "State Signal" is 0. When Out 0 is driven by a right-side input port selected from In (n + 1) to In 2n, the LSB of the "State Signal" is 1. When the data given to In 0 is passed straight through to Out 0, all bits of the "State Signal" are 1.
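One reading of the port counts, offered as an assumption rather than as the paper's own equation: if the input ports are indexed In 0 through In 2n, the switch has 2n + 1 input ports, and equating this with the stated count of 2N_p − 1 gives the relation below.

```latex
2n + 1 = 2N_p - 1 \quad\Longrightarrow\quad n = N_p - 1
```

Under this reading, n = 6 would correspond to N_p = 7 and 13 ports on each side of the switch.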
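The selection rule for Out 0 can be sketched as a small decoder. The LSB rule and the all-ones case come from the text. The assumption that the upper bits encode K − 1 is inferred from the Table I example (1010 selects In 6 for n = 6) and is not stated explicitly. The function name and the `width` parameter are hypothetical.

```python
def decode_state_signal(signal: int, n: int, width: int = 4) -> int:
    """Return the index i of the input port In i that drives Out 0.

    Rules taken from the text:
      * all bits 1  -> In 0 (straight-through path)
      * LSB == 0    -> a left-side input In K, 1 <= K <= n
      * LSB == 1    -> a right-side input In (n + K), 1 <= K <= n
    Assumption: the bits above the LSB hold K - 1.
    """
    all_ones = (1 << width) - 1
    if signal == all_ones:
        return 0
    k = (signal >> 1) + 1  # upper bits assumed to hold K - 1
    if not 1 <= k <= n:
        raise ValueError(f"state signal {signal:0{width}b} is not assigned")
    return k if (signal & 1) == 0 else n + k


for code in (0b1010, 0b0011, 0b1111):
    print(f"{code:04b} -> In {decode_state_signal(code, n=6)}")
```

Unassigned codes raise an error rather than silently selecting a port, which seems the safer choice for a sketch of a control signal.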
Table I. Example assignment of the "State Signal" and the selected source of Out 0 (n = 6)
State Signal    Source of Out 0
0000            In 1
0001            In 7
0010            In 2
0011            In 8
0100            In 3
0101            In 9
0110            In 4
0111            In 10
1000            In 5
1001            In 11
1010            In 6
1011            In 12
1111            In 0
The behavior of the switch is classified into four modes: No Malfunctions Mode (NMM), Checking Module Mode (CMM), and two types of Eliminating Malfunctions Mode (EMM), namely EMM1 and EMM2. EMM1 is used when the switches have no malfunctions, and EMM2 is used when the switches have malfunctions.
Figure 4 shows the data flow in the whole system and in the switch when the system operates in the CMM. The data given to In K goes to Out (n + K + 1), and the data given to In (n + K) goes to Out (K + 1), where K ranges from 1 to n − 1. The data given to In 0 goes to Out 1, and the data given to In n goes to Out 0. In the case of Table I, the state signal is therefore 1010. When the system operates in EMM1, the data flow of the switch is not unique; it depends on the locations and the number of malfunctions. Figure 5 shows an example of the data flow in the whole system and in the switch when the system operates in EMM1. To avoid the malfunctions, the switches change the data flows. Figure 6 shows the data flow in the system operating in EMM2. When the switches have malfunctions, the circuit cannot be reconfigured. Therefore, only the parallel lines in which all modules are working correctly are the active parallel lines.
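The CMM port assignment stated above can be written out as a lookup table. This is a minimal sketch that encodes only the four rules given in the text. The function name is hypothetical, and ports that the text does not mention (In 2n and Out (n + 1)) are simply left unassigned.

```python
from typing import Dict


def cmm_routing(n: int) -> Dict[int, int]:
    """Map input port index -> output port index for the CMM.

    Encodes only the rules stated in the text:
      In 0       -> Out 1
      In n       -> Out 0
      In K       -> Out (n + K + 1)   for K = 1 .. n - 1
      In (n + K) -> Out (K + 1)       for K = 1 .. n - 1
    """
    routes = {0: 1, n: 0}
    for k in range(1, n):
        routes[k] = n + k + 1
        routes[n + k] = k + 1
    return routes


routes = cmm_routing(6)
for src in sorted(routes):
    print(f"In {src} -> Out {routes[src]}")
```

For n = 6 the table routes In 6 to Out 0, which agrees with the state signal 1010 quoted for this mode, and no two inputs share an output.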
IV. TESTER FOR THE ROBUST ARCHITECTURE
To achieve the robust architecture, a tester is required. The tester searches for modules having malfunctions in the system and generates the reconfiguration data. Since the system is checked off-line, the tester is attached to the system. The tester consists of test pattern generators, sample modules, a comparator, and a reconfiguration data generator. Figure 7 shows the robust system and its tester, which is encircled by the broken line.
The behavior of the tester is as follows. The test pattern generator produces the test patterns, which are given to 'Sample Module1' and to each 'Module 1' (shown in Fig. 7). The patterns are processed by each module, and the results are given to the 'Comparator'. When the data are transferred from each 'Module 1' to the 'Comparator', the 'Buffer' is used as an interface. The comparator compares the data, and the information about the faulty modules is given to the 'Reconfiguration data generator'. The other modules are checked in the same way, and the 'Reconfiguration data generator' outputs the reconfiguration data.
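The comparison step of the tester can be sketched in software. This is a behavioral illustration only, not the hardware design: modules are modeled as plain Python callables, the sample module is assumed to be fault-free, and the function name `find_faulty_modules` is hypothetical.

```python
from typing import Callable, List, Sequence


def find_faulty_modules(
    modules: Sequence[Callable[[int], int]],
    sample_module: Callable[[int], int],
    test_patterns: Sequence[int],
) -> List[int]:
    """Return the indices of modules whose outputs differ from the sample module.

    Mirrors the described flow: the same test patterns are applied to the
    sample module and to every module under test, and a comparator flags
    any module whose responses do not match.
    """
    expected = [sample_module(p) for p in test_patterns]
    faulty = []
    for index, module in enumerate(modules):
        observed = [module(p) for p in test_patterns]
        if observed != expected:
            faulty.append(index)
    return faulty


def good(x: int) -> int:
    return x + 1


def stuck_at_zero(x: int) -> int:
    return 0  # models a module whose output is stuck


print(find_faulty_modules([good, stuck_at_zero, good], good, [0, 1, 2, 3]))
```

The returned index list plays the role of the fault information passed to the reconfiguration data generator.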
V. OPERATION FLOW OF THE ROBUST SYSTEM
The operation flow for robust processing is executed to find the modules having malfunctions and to reconfigure the circuit to avoid them. First, the whole system is checked for malfunctions. The test patterns are applied to the test input ports. A test pattern for checking 'Module 1' (see Fig. 7) is given to the test input port. These data are given to each 'Module 1', and the output data from each 'Module 1' are given to the corresponding 'Switch 1'. The buffer receives the data from each 'Switch 1' and outputs them from the test output ports. The other modules are checked in the same way, and the information about the faulty modules is given to the tester. According to the location information of the faulty modules, the tester outputs the reconfiguration data, and the switches change the data flow to avoid the malfunctions. If the system then works correctly, it operates in EMM1. If the system does not work correctly, the switches have malfunctions. When the switches have malfunctions, the circuit cannot be reconfigured. Therefore, only the parallel lines having no malfunctions become active parallel lines. In such a case, the system operates in EMM2.
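The decision at the end of this flow can be summarized as a small sketch. The mode names come from the paper. The function name and the two boolean inputs are hypothetical simplifications of the checking and re-checking steps.

```python
from enum import Enum


class Mode(Enum):
    NMM = "No Malfunctions Mode"
    EMM1 = "Eliminating Malfunctions Mode 1"
    EMM2 = "Eliminating Malfunctions Mode 2"


def select_operating_mode(faults_found: bool, works_after_reconfiguration: bool) -> Mode:
    """Pick the operating mode after the checking phase (CMM) has finished.

    faults_found: the tester reported at least one faulty module.
    works_after_reconfiguration: the system behaves correctly once the
        switches have applied the reconfiguration data; if it does not,
        the switches themselves are taken to be faulty.
    """
    if not faults_found:
        return Mode.NMM
    if works_after_reconfiguration:
        return Mode.EMM1
    return Mode.EMM2


print(select_operating_mode(faults_found=True, works_after_reconfiguration=False))
```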
VI. VITERBI DECODER AND EVALUATION
A block diagram of a high-speed Viterbi decoder is illustrated in Fig. 8. The Viterbi decoder consists of an ACS (Add-Compare-Select) module, an SVPM (Survivor Path Memory) module, and a TBU (Trace Back Unit) module. The ACS selects the optimum path metric, which indicates the minimum distance from the input data, and outputs the survivor paths. The SVPM stores the above path-metric information. The TBU executes the trace-back processing. When higher throughput is demanded, parallel processing can be adopted. This parallel Viterbi decoder achieves a throughput of n bit/clock. The logic synthesis result of the Viterbi decoder is shown in Table II. The Viterbi decoder is designed in Verilog-HDL, and the logic synthesis is executed using the TSMC 0.25 um standard cell library.
Figure 9 shows the robust Viterbi decoder based on the proposed architecture. Some switches and a buffer are added to the high-speed Viterbi decoder so that the circuit can be reconfigured. The size of each module composing the robust Viterbi decoder and the percentages of the additional modules and the critical modules are shown in Table III.
To evaluate the robustness, computer simulations are conducted. In the simulation, a parallel system adopting the robust architecture is assumed. Some parts of the system are broken at random, and the number of active parallel lines is estimated. The number of active parallel lines determines the operating performance in the case of malfunctions. Each result is the mean value over 1000 trials. The robust system has critical modules that must not be broken. If such modules have malfunctions, the number of active parallel lines is counted as zero. Figure 10 shows the relation between the number of faults in the system and the number of active parallel lines of the high-speed Viterbi decoder. The conventional architecture, which duplicates the same parallel blocks in the Viterbi decoder (2N_p blocks of Fig. 8 in total), loses active parallel lines as the number of malfunctions increases. The proposed architecture ensures more active parallel lines, and its circuit area is almost half that of the duplicated decoder. The robust architecture is expected to increase the number of active parallel lines further when it is applied to a system whose modules are large.
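The kind of random-fault simulation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulator: faults are placed uniformly over equally sized units, the number of active parallel lines is limited by the pipeline stage with the most faults (ideal switches), and any fault in a critical unit gives zero active lines, as in the text. All names and parameters are hypothetical.

```python
import random
from typing import Optional


def mean_active_lines(
    num_lines: int,
    num_stages: int,
    num_faults: int,
    num_critical: int = 0,
    trials: int = 1000,
    seed: Optional[int] = None,
) -> float:
    """Estimate the mean number of active parallel lines by Monte Carlo.

    Each trial breaks `num_faults` distinct units chosen uniformly from the
    computing modules and `num_critical` critical units (line index -1).
    """
    rng = random.Random(seed)
    units = [(line, stage) for line in range(num_lines) for stage in range(num_stages)]
    units += [(-1, c) for c in range(num_critical)]  # -1 marks a critical unit
    total = 0
    for _ in range(trials):
        broken = rng.sample(units, num_faults)
        if any(line == -1 for line, _ in broken):
            continue  # a broken critical unit counts as zero active lines
        faults_per_stage = [0] * num_stages
        for _, stage in broken:
            faults_per_stage[stage] += 1
        total += num_lines - max(faults_per_stage)
    return total / trials


for faults in range(0, 5):
    print(faults, mean_active_lines(8, 3, faults, num_critical=2, seed=1))
```

The default of 1000 trials follows the text. Unit sizes are not modeled here, so a larger module is no more likely to be hit than a small switch, which the area figures of Table III would refine.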
VII. CONCLUSION
In this paper, a robust VLSI architecture was proposed. Its features are the small size of the additional modules and the small number of critical modules. The robust system maintains high performance by avoiding malfunctions. Switches with several operating modes are newly applied to realize the robust architecture.
