Abstract. This paper presents an improved multi-modular redundancy system with self-test and self-healing capabilities. FPGA dynamic partial reconfiguration is used to repair faults in the system, with the goal of improving system reliability. The system contains four similar MIPS processor modules and is divided into two subsystems, each containing two MIPS processors. Errors in the system are resolved by repairing only the subsystem in which they occur. This design reduces the repair time, prolongs the system's life and increases the error-free operating time. Finally, the system is assessed with a Markov process, and the results show that its reliability is clearly improved.
Introduction
With the rapid development of semiconductor science, materials science, electrical engineering and computer science, limited hardware resources can no longer meet people's needs. Under high radiation, strong vibration, humidity, severe temperature changes and other unexpected conditions in special environments, the bitstream information in communications equipment and mainframe computers is easily misplaced or flipped, components tend to fail, and various hard and soft errors occur. Fault tolerance is therefore a growing challenge in the high-tech industry.

Many institutions have already made progress in the field of reconfiguration: the University of California proposed and developed the M1/M2 chip and the Garp system; the SIDNA project proposed a coarse-grained FPGA; the Massachusetts Institute of Technology proposed the DPGA; and Xilinx introduced the Virtex series of FPGAs. These results and products offer significant technical support for four-modular redundant reconfigurable technology. However, most of these technologies have not been combined with fault-tolerant technology, and integrating FPGA-based dynamic reconfiguration with fault tolerance is a relatively new direction of development.

Based on the principles of FPGA dynamic partial reconfiguration and a four-module fault-tolerant scheme, this work designs and implements a prototype of an improved four-modular redundant system with self-test and self-repair. The technique autonomously reconfigures a faulty module or faulty part of the equipment without human presence and completes the repair independently. This keeps the system operating normally, ensures its safety and reliability, and also reduces the waste of resources.
Design and Implementation
(1) The architecture of the four-modular redundant system is shown in Fig. 1.

Fig. 1 The architecture of the four-modular redundant system
As shown in Fig. 1, the main components of the system are the static logic, the reconfigurable logic and the result acquisition module. The static logic includes the instruction memory, the data memory, the error output module and the data output selection module. The reconfigurable logic includes two identical subsystems, System A and System B; each subsystem consists of two MIPS-like modules and an analysis module.
(2) Main logic of the four-modular redundancy system

The error output module has two functions. First, it outputs the failure detection signals of the modules. Second, it issues the output selection control signal according to the fault detection signals. Together, the output selection module and the error output module form the fault analysis and troubleshooting structure of the system.
The reconfigurable logic consists of two identical subsystems, each of which is one reconfiguration unit. When an error occurs, only the affected subsystem is reconfigured and the rest of the reconfigurable logic does not have to be rewritten, which improves the efficiency of reconfiguration. Each subsystem includes two MIPS-like processors without memory modules and an analysis unit. The two MIPS-like processors process the input data, and the output data generated by each of them are passed to the analysis unit. The analysis unit contains an error detection unit and an output data generation unit. The error detection unit uses its input data to check the subsystem for errors and outputs the result of this analysis outside the subsystem. The output data generation unit filters, extracts and integrates its input data, and finally generates the correct data, which are output outside the subsystem.
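The comparison and selection logic described above can be sketched in a few lines of Python. This is only a behavioral illustration under stated assumptions: the names SubsystemResult, analyze and select_output are hypothetical, error detection is assumed to be a plain equality check between the two module outputs, and the selector is assumed to prefer System A whenever it reports no error. None of these details are specified in the text.

```python
from dataclasses import dataclass


@dataclass
class SubsystemResult:
    """Output of one subsystem's analysis unit (hypothetical structure)."""
    data: int    # value forwarded to the static logic
    error: bool  # True when the two module outputs disagree


def analyze(out_1: int, out_2: int) -> SubsystemResult:
    """Compare the outputs of the two MIPS-like modules of one subsystem.

    Assumption: a mismatch means the subsystem is faulty, and its data
    is then not trusted by the selector.
    """
    return SubsystemResult(data=out_1, error=(out_1 != out_2))


def select_output(sys_a: SubsystemResult, sys_b: SubsystemResult):
    """Pick the system output and list the subsystems to reconfigure."""
    to_repair = [name for name, r in (("A", sys_a), ("B", sys_b)) if r.error]
    if not sys_a.error:
        return sys_a.data, to_repair
    if not sys_b.error:
        return sys_b.data, to_repair
    return None, to_repair  # both subsystems faulty: no trusted output
```

In this sketch a single faulty module only marks its own subsystem for reconfiguration, while the other subsystem keeps supplying the system output.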
Brief introduction to the MIPS-like processor
The MIPS-like architecture is shown in Fig. 2. A single-cycle processor executes one instruction per clock cycle. Its structure is easy to explain and its control unit is simple. Because every instruction finishes within one cycle, no additional storage elements are required; however, the clock period is determined by the slowest instruction. The fault-tolerant system contains four identical MIPS-like processing modules, each with the structure shown in Fig. 2. A MIPS-like module is divided into three parts: the control unit, the register file and the ALU. When coupled with an instruction memory and a data memory, a MIPS-like module becomes a complete MIPS processor.
Instructions fetched from the instruction memory are delivered to the input ports of the two MIPS-like modules. Each instruction consists of several fields. The control unit of each MIPS-like module derives the control signals from the instruction. The register file is used to read and store data. The ALU processes the data read from the register file; it performs addition, subtraction, shifting, logical operations and other computations, and its result is sent to the data memory or fed back as an input to the register file.
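As a rough illustration of the ALU behavior described above, the following Python sketch models a 32-bit ALU. The operation names ("add", "sub", "and", "or", "nor", "sll", "srl") and the exact set of supported operations are assumptions made for this example; the text does not list the control encoding of the actual design.

```python
MASK32 = 0xFFFFFFFF  # keep every result within 32 bits


def alu(op: str, a: int, b: int) -> int:
    """Return the 32-bit result of `op` applied to operands a and b.

    The operation names are hypothetical labels, not the control
    signals of the processor described in the text.
    """
    a &= MASK32
    b &= MASK32
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32          # two's-complement wrap-around
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "nor":
        return ~(a | b) & MASK32
    if op == "sll":
        return (a << (b & 31)) & MASK32  # shift left logical
    if op == "srl":
        return a >> (b & 31)             # shift right logical
    raise ValueError(f"unsupported ALU operation: {op}")
```

Masking with MASK32 mimics the fixed register width, so that overflow wraps around as it would in 32-bit hardware.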
The MIPS-like system uses 32-bit instructions and defines three instruction formats: R-type, I-type and J-type. The three formats are shown in Table 1.

Table 1 Format of the MIPS instruction set (field widths in bits)
R-type: op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
I-type: op (6) | rs (5) | rt (5) | imm (16)
J-type: op (6) | addr (26)

Here "op" and "funct" are the instruction opcodes that determine the operation of the ALU and the storage units. "rs" and "rt" are the source registers and "rd" is the destination register. "shamt" is used only for shift operations. "imm" is a 16-bit immediate value that can be used directly in the calculation. "addr" is a 26-bit address operand.
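The field layout of Table 1 can be illustrated with a small Python decoder. The sketch assumes that the fields are packed from the most significant bit downward in the order listed in Table 1, as in the standard MIPS32 encoding; the helper names field, decode_r, decode_i and decode_j are hypothetical.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo of a 32-bit word (bit 0 is the least significant)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_r(word: int) -> dict:
    """R-type: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return {
        "op": field(word, 31, 26), "rs": field(word, 25, 21),
        "rt": field(word, 20, 16), "rd": field(word, 15, 11),
        "shamt": field(word, 10, 6), "funct": field(word, 5, 0),
    }


def decode_i(word: int) -> dict:
    """I-type: op(6) rs(5) rt(5) imm(16)."""
    return {
        "op": field(word, 31, 26), "rs": field(word, 25, 21),
        "rt": field(word, 20, 16), "imm": field(word, 15, 0),
    }


def decode_j(word: int) -> dict:
    """J-type: op(6) addr(26)."""
    return {"op": field(word, 31, 26), "addr": field(word, 25, 0)}
```

For example, decode_r(0x012A4020) yields op 0, rs 9, rt 10, rd 8, shamt 0 and funct 32, which is the standard MIPS32 encoding of add $t0, $t1, $t2.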
Reliability analysis of the system
In this paper, a Markov process is used to analyze the reliability of a repairable system. A repairable system is one whose failed components can be repaired and restored to working condition. Most engineering systems are repairable. Because faults differ in location, cause and severity, and because maintenance equipment and the skill of repair personnel also differ, the repair time is a random variable. For this reason, the reliability analysis of repairable systems is more complex than that of non-repairable systems, and stochastic processes are the main tool for studying them. When both the lifetime of a component and the repair time of a failed component are exponentially distributed, the system can be described by a Markov process.
For the reliability analysis of the multi-modular redundant system, we consider a homogeneous Markov process with discrete states and continuous time. For any states i and j, the following equation holds:

P{X(t + Δt) = j | X(t) = i} = P{X(Δt) = j | X(0) = i} = Pij(Δt)

That is, the probability of moving from state i to state j depends only on the elapsed time Δt and not on the starting time t. Pij(Δt) denotes the transition probability from state i to state j, and the matrix formed by the Pij(Δt) is called the transition matrix. The state transition matrix P(Δt) is shown in formula (1):
The transition density matrix Q is shown in formula (2). From P(t), the probability of each state at time t can be obtained. The four MIPS modules are identical in structure and function. Each module is assumed to have a repair rate μ, which is the probability of reconstruction.
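For reference, the standard relations between the transition matrix and the transition density matrix of a homogeneous Markov process with finitely many states can be written as follows. These are textbook definitions, not a reconstruction of the paper's formulas (1) and (2); P(t) denotes the matrix of the transition probabilities P_ij(t), and the symbols q_ij (entries of Q) and I (identity matrix) are notation assumed here.

```latex
\begin{aligned}
q_{ij} &= \lim_{\Delta t \to 0} \frac{P_{ij}(\Delta t)}{\Delta t}, \qquad i \neq j, \\
q_{ii} &= -\sum_{j \neq i} q_{ij}, \\
P(\Delta t) &= I + Q\,\Delta t + o(\Delta t), \\
\frac{d}{dt} P(t) &= P(t)\,Q, \qquad P(0) = I .
\end{aligned}
```

The last line is the Kolmogorov forward equation; solving it for a given Q yields the state probabilities at any time t.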
Matlab simulation shows the influence of different repair rates μ (reconstruction probabilities) on reliability. The reliability analysis results are shown in Table 2. The data in Table 2 indicate that system reliability increases as the reconstruction probability increases.
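The qualitative effect of the repair rate can be reproduced with a much simpler model than the one analyzed in the paper. The Python sketch below integrates the state equation dp/dt = p(t) Q numerically and applies it to a single repairable module with two states (working, failed). The two-state model, the failure rate lam and all numeric values are illustrative assumptions; they are not the paper's four-module model and do not reproduce the data of Table 2.

```python
import math


def state_probabilities(q, p0, t, steps=10_000):
    """Integrate dp/dt = p(t) Q with the classical Runge-Kutta method (RK4).

    q  : transition density matrix as a list of rows (each row sums to 0)
    p0 : initial state probability vector
    """
    n = len(p0)

    def deriv(p):
        # (p Q)_j = sum over i of p_i * q_ij
        return [sum(p[i] * q[i][j] for i in range(n)) for j in range(n)]

    h = t / steps
    p = list(p0)
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([p[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = deriv([p[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = deriv([p[i] + h * k3[i] for i in range(n)])
        p = [p[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(n)]
    return p


def working_probability(lam, mu, t):
    """Probability that a single repairable module works at time t."""
    q = [[-lam, lam],  # state 0: working, fails at rate lam (assumed)
         [mu, -mu]]    # state 1: failed, repaired at rate mu
    return state_probabilities(q, [1.0, 0.0], t)[0]


def working_probability_exact(lam, mu, t):
    """Closed-form solution of the same two-state model, for comparison."""
    s = lam + mu
    return mu / s + lam / s * math.exp(-s * t)
```

Calling working_probability with a fixed lam and increasing values of mu gives increasing probabilities of finding the module in the working state, which is the same trend that the text reports for the full system.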
Conclusions
This paper presents exploratory work on a multi-modular redundancy system and on the reliability analysis of such systems. Unlike other designs, this design achieves self-repair through dynamic reconfiguration, so faulty modules do not remain in the system for long. This saves chip area, improves chip utilization and helps ensure that the system produces correct results. The design uses a Markov process to analyze the reliability of the system, and the results are expected to be verified on the XUPV5-LX110T development board.
