Abstract: Radiation and extreme temperature are the main inhibitors of the use of electronic devices in space applications. Radiation challenges the normal and stable operation of DC-DC converters, which are used as power supplies for onboard systems in satellites and spacecraft. In this situation, special design techniques known as radiation hardening or radiation tolerant design have to be employed. In this work, a module-level design approach to radiation hardening is addressed. A module in this sense is a constituent of a digital controller, which includes an analog-to-digital converter (ADC), a digital proportional-integral-derivative (PID) controller, and a digital pulse width modulator (DPWM). As a new radiation hardening by design (RHBD) technique, a four-module redundancy technique is proposed and applied to the digital voltage mode controller driving a synchronous buck converter, which has been implemented as a hardware-in-the-loop (HIL) simulation block in MATLAB/Simulink using Xilinx System Generator, based on the Zynq-7000 development board (ZYBO). The technique is compared, in terms of reliability and hardware resource requirement, with triple modular redundancy (TMR), five modular redundancy (FMR) and the modified triplex-duplex architecture. Furthermore, radiation induced failures are emulated by switching the inputs of the duplicated modules to different signals, or to ground, during simulation. The simulation results show that the proposed technique has a 25% and 30% longer expected life than the TMR and FMR techniques, respectively, and a lower hardware resource requirement than the FMR and modified triplex-duplex techniques.
Introduction
Outer space is full of radiation sources, including solar wind, solar flares, coronal mass ejections, galactic cosmic rays, the Van Allen radiation belts and solar particle events. This radiation environment consists of particles such as protons, electrons, neutrons, and heavy ions [1]. A strike by any of these particles may compromise the normal operation of electronic circuits on board space systems. Depending on the type and characteristics of the impinging radiation, different effects, either irreversible or (partially or totally) reversible, may arise. The two major effects of radiation are total ionizing dose (TID) and single event effects (SEE). TID, also called the cumulative effect, produces gradual changes in the operational parameters of the devices, which tend to degrade the device characteristics over time. SEEs cause abrupt changes or transient behavior in circuits. Such effects interfere with the operation of space systems' electronics and, in some cases, threaten the survival of such systems. While TID effects reveal themselves gradually, often after years of operation before a complete failure, SEEs do not. This work considers alleviating the effects of SEEs on electronic circuits used for space applications.
The study of techniques to keep electronic circuits operational in such hostile environments has recently increased [2], driven by the growing number of applications of radiation tolerant circuits, such as space missions, satellites and high-energy physics experiments [3, 4]. This paper considers a module-level approach to radiation hardening using a fault tolerant method.
Fault tolerant methods use redundancy to mask or work around faults in electronic circuits. Redundancy is one of the most important means of obtaining highly reliable systems: by providing redundant hardware components, it allows continuous service to be delivered in the presence of hardware faults. In general, redundancy techniques adopt additional hardware components or additional computation time, which are used for fault detection or fault masking so that the effect of a fault is not reflected in the output signal [5]. The most common radiation mitigation techniques are the TMR and FMR methods [6, 7]. They are highly efficient but very costly, and are used in situations where high reliability is targeted. Reliability is an important quality measure of a fault tolerant system.
Reliability is defined as the probability of not failing in a particular environment for a particular mission time. Suppose a system consists of N identical components. Let S(t) be the number of surviving components at time t, and Q(t) the number of components that have failed up to time t. Then the probability of survival of the components, also known as the reliability R(t), is given by:

$$R(t) = \frac{S(t)}{N} \tag{1}$$

A measure of failure F(t), referred to as the unreliability or failure time distribution, is defined as the conditional probability that the system fails by time t:

$$F(t) = \frac{Q(t)}{N} \tag{2}$$

Since S(t) + Q(t) = N, therefore:

$$R(t) = 1 - F(t) \tag{3}$$

Since F(t) is a probability, its derivative is a probability density function, defined as

$$f(t) = \frac{dF(t)}{dt}$$

where f(t) is the probability of failure per unit time. The failure rate λ is defined as the number of failures per unit time compared with the number of surviving components:

$$\lambda(t) = \frac{\text{number of failures per unit time}}{\text{number of surviving components}} = \frac{1}{S(t)}\frac{dQ(t)}{dt}$$

or

$$\lambda(t) = \frac{f(t)}{R(t)}$$

Using Equation (3), the failure rate can be written as

$$\lambda(t) = -\frac{1}{R(t)}\frac{dR(t)}{dt}$$

This expression may be integrated from 0 to t, noting that the reliability is 1 at time 0 and R(t) at time t:

$$\int_0^t \lambda(\tau)\,d\tau = -\int_1^{R(t)} \frac{dR}{R} = -\ln R(t)$$

Often λ is assumed to be constant during the useful life of the system. Thus,

$$\lambda t = -\ln R(t)$$

This gives

$$R(t) = e^{-\lambda t}$$

The mean time to failure (MTTF) of the system is obtained as

$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$$
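Assuming independent and identical modules, each with reliability R_m and constant failure rate λ, the binomial theorem gives the probability that exactly k of N modules are functioning:

$$B(k:N) = \binom{N}{k} R_m^{k} (1 - R_m)^{N-k}$$

The reliability of TMR is given as

R_TMR = Probability that all three modules are functioning + Probability that any two modules are functioning (13)

$$R_{TMR} = B(3:3) + B(2:3) = R_m^3 + 3R_m^2(1 - R_m) = 3R_m^2 - 2R_m^3$$

With R_m = e^{-λt}, the MTTF of TMR is

$$\mathrm{MTTF}_{TMR} = \int_0^\infty \left(3e^{-2\lambda t} - 2e^{-3\lambda t}\right)dt = \frac{3}{2\lambda} - \frac{2}{3\lambda} = \frac{5}{6\lambda} \tag{15}$$

For the FMR method, at least three of the five modules must be functioning:

$$R_{FMR} = B(5:5) + B(4:5) + B(3:5) = 10R_m^3 - 15R_m^4 + 6R_m^5$$

$$R_{FMR}(t) = 10e^{-3\lambda t} - 15e^{-4\lambda t} + 6e^{-5\lambda t} \tag{19}$$

$$\mathrm{MTTF}_{FMR} = \frac{10}{3\lambda} - \frac{15}{4\lambda} + \frac{6}{5\lambda} = \frac{47}{60\lambda} \tag{20}$$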
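The TMR and FMR expressions are both instances of a k-out-of-n system, so they can be checked numerically. The following is a minimal Python sketch under that reading; the function names and the evaluation point λt = 0.5 are illustrative assumptions, not part of the original work.

```python
from fractions import Fraction
from math import comb, exp


def binom_term(k, n, r_m):
    """B(k:n): probability that exactly k of n identical modules are working."""
    return comb(n, k) * r_m**k * (1 - r_m) ** (n - k)


def k_of_n_reliability(k_min, n, r_m):
    """Reliability of a system that needs at least k_min of its n modules."""
    return sum(binom_term(k, n, r_m) for k in range(k_min, n + 1))


def k_of_n_mttf(k_min, n):
    """MTTF in units of 1/lambda, for identical constant failure rates."""
    # With i modules alive, the next failure arrives at rate i*lambda, so the
    # system spends 1/(i*lambda) on average in that state.
    return sum(Fraction(1, i) for i in range(k_min, n + 1))


r_m = exp(-0.5)  # module reliability at lambda * t = 0.5 (arbitrary point)
r_tmr = k_of_n_reliability(2, 3, r_m)
r_fmr = k_of_n_reliability(3, 5, r_m)
mttf_tmr = k_of_n_mttf(2, 3)
mttf_fmr = k_of_n_mttf(3, 5)

print(f"R_TMR = {r_tmr:.4f}, R_FMR = {r_fmr:.4f}")
print(f"MTTF_TMR = {mttf_tmr}/lambda, MTTF_FMR = {mttf_fmr}/lambda")
```

Exact fractions are used for the MTTF so that the closed-form values can be compared without rounding.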
Motivation

The Base Architecture
The proposed method is derived from the architecture presented in [5], which is called triplex-duplex redundancy. In this arrangement there are three primary modules using two duplicate modules each. Thus, a total of six identical modules, grouped in three pairs, compute in parallel. The computation results of each pair are compared using a comparator. If the results agree, the output of the comparator participates in the voting. If not, the pair of modules is declared faulty and the switch removes the pair from the system. The hardware resource requirement is 500% higher than that of the simplex system and twice that of the TMR technique.
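The compare-then-vote behavior described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the evaluation is stateless (the permanent removal of a faulty pair by the switch is not modeled), and the way the voter treats fewer than three surviving pairs is an assumption.

```python
from collections import Counter


def triplex_duplex_output(pairs):
    """Evaluate one cycle of a triplex-duplex stage.

    pairs: three (a, b) tuples holding the outputs of the two modules in
    each duplex. Returns (voted_value, agree_flags); voted_value is None
    when no strict majority exists among the surviving pairs.
    """
    agree_flags = []
    candidates = []
    for a, b in pairs:
        agree = a == b           # comparator: do the two duplicates match?
        agree_flags.append(agree)
        if agree:                # only agreeing pairs take part in the vote
            candidates.append(a)
    if not candidates:
        return None, agree_flags
    value, count = Counter(candidates).most_common(1)[0]
    # Assumed voter rule: strict majority among the pairs still voting.
    if 2 * count > len(candidates):
        return value, agree_flags
    return None, agree_flags


print(triplex_duplex_output([(5, 5), (5, 5), (5, 7)]))
```

In this example the third pair disagrees internally, so it is excluded and the remaining two pairs determine the output.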
Modified Triplex-Duplex Architecture
The disadvantage of the triplex-duplex architecture is that it requires twice the hardware resources of the TMR method and has one more module than the FMR method. Both modules in a duplex are removed from the voting as soon as one of the two modules in the duplex fails. This reduces the overall system mean time to failure (MTTF) if no repair is used. Therefore, except for faulty duplex detection, it is similar in operation to TMR.
To increase the reliability of this method, the modified architecture shown in Figure 1 was developed. The comparator and switch parts are combined and modified in such a way that all duplexes are connected to all disagreement detectors and switch blocks, which allows any module in the three duplex systems to act as an active spare for any other module in the three duplex systems. Therefore, the overall system continues to work even if one module in each of the three duplexes has failed, or if only one duplex is left, or if two duplexes with one good module each are left. This significantly increases the MTTF of the overall system and, if any repair or reconfiguration is used, helps to reduce the frequency of such repair or reconfiguration compared with the TMR-only or FMR-only methods. This method uses 500% more hardware than the simplex system, the same as its base architecture, but with a considerable increase in reliability.
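Assuming independent and identical modules, each with reliability R_m and constant failure rate λ, and given that the system remains operational as long as at least two of the six modules are fault free:

$$R_{Mod\text{-}TD} = \sum_{k=2}^{6} \binom{6}{k} R_m^{k}(1 - R_m)^{6-k} = 15R_m^2 - 40R_m^3 + 45R_m^4 - 24R_m^5 + 5R_m^6$$

$$\mathrm{MTTF}_{Mod\text{-}TD} = \frac{1}{\lambda}\left(\frac{15}{2} - \frac{40}{3} + \frac{45}{4} - \frac{24}{5} + \frac{5}{6}\right) = \frac{29}{20\lambda} \tag{24}$$

The MTTF is thus roughly 0.61/λ and 0.66/λ longer than that of TMR (5/(6λ)) and FMR (47/(60λ)), a 61% and 66% improvement, respectively, when expressed relative to the simplex MTTF of 1/λ.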
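The MTTF of the modified triplex-duplex architecture can be cross-checked with a small Monte Carlo experiment. The sketch below assumes that the system survives while at least two of its six modules are healthy, which is inferred from the failure cases listed above, and that module lifetimes are independent and exponentially distributed. The function name, trial count and seed are illustrative choices.

```python
import random


def simulate_mttf(n, k_min, lam=1.0, trials=20000, seed=1):
    """Estimate the MTTF of a k_min-out-of-n system by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one exponential lifetime per module and sort them.
        lifetimes = sorted(rng.expovariate(lam) for _ in range(n))
        # The system fails when fewer than k_min modules remain, i.e. at the
        # (n - k_min + 1)-th module failure (index n - k_min after sorting).
        total += lifetimes[n - k_min]
    return total / trials


estimate = simulate_mttf(n=6, k_min=2)
print(f"simulated MTTF ~ {estimate:.3f}/lambda (analytical 1.45/lambda)")
```

With λ = 1 the estimate should land close to 29/20 = 1.45, the value obtained by integrating the reliability polynomial.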
Proposed Four Modules Architecture
Despite having the best reliability, and consequently the best MTTF, the modified triplex-duplex architecture has the disadvantage of high hardware resource utilization. In an effort to combine high reliability with a lower resource requirement, a four-module architecture was developed, as shown in Figure 2. It has a higher reliability than both the TMR and FMR methods and a lower hardware resource requirement than the FMR and modified triplex-duplex methods. The operation of this architecture is similar to that of the modified triplex-duplex architecture above, except that there are four physical modules and two clone modules, which reduces the total number of actual duplicated modules from six to four. The clone modules are created as long as at least two of the physical modules are fault free, which significantly reduces hardware resource utilization compared with the FMR and modified triplex-duplex methods. The architecture masks the failure of two physical modules out of four.
The proposed four-module architecture is comparable, in terms of reliability, to the highly reliable four-module self-purging redundancy [8, 9]. Self-purging redundancy uses a threshold voter instead of a majority voter. A threshold voter outputs a 1 if the number of its inputs that are 1 is greater than or equal to the threshold value; otherwise it outputs a 0. The idea of self-purging redundancy is that if only one module fails, its output will differ from the others. A switch checks whether a module's output differs from the output of the threshold voter. If it does, the module is assumed to be faulty and its control flip-flop is reset to 0. This permanently masks the output of the module, so that its input to the threshold voter will always be 0.
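The self-purging mechanism described above can be illustrated with a short Python sketch operating on single-bit module outputs. The function names and the list-based representation of the control flip-flops are assumptions made for illustration; the sketch is not taken from [8, 9].

```python
def threshold_vote(bits, threshold):
    """Output 1 if at least `threshold` inputs are 1, otherwise 0."""
    return 1 if sum(bits) >= threshold else 0


def self_purging_step(module_outputs, enabled, threshold):
    """One evaluation cycle of a self-purging stage.

    module_outputs: list of 0/1 module outputs.
    enabled: list of 0/1 control flip-flop states (0 = module purged).
    Returns (voted_bit, new_enabled).
    """
    # A purged module always presents 0 to the threshold voter.
    masked = [out & en for out, en in zip(module_outputs, enabled)]
    voted = threshold_vote(masked, threshold)
    # Reset the flip-flop of every module whose output differs from the vote.
    new_enabled = [en if out == voted else 0
                   for out, en in zip(module_outputs, enabled)]
    return voted, new_enabled


voted, enabled = self_purging_step([1, 1, 1, 0], [1, 1, 1, 1], threshold=2)
print(voted, enabled)
```

Once a flip-flop has been reset it stays at 0 in later calls, which models the permanent masking of the faulty module.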
As pointed out in [8], the self-purging method is not very popular because of its complex threshold voter architecture. In the self-purging technique, faulty module detection is performed by comparing each module's output with the voted output. In the developed four-module method, however, the detection of the faulty module is carried out before voting, which reduces the complexity encountered with a faulty voter, especially when multiple voters are used as in self-purging redundancy. Moreover, the proposed four-module redundancy technique can tolerate the simultaneous failure of two modules, whereas a four-module self-purging redundancy with a threshold of 2 cannot. Self-purging redundancy with a threshold of T can tolerate up to T − 1 simultaneous failures.
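Assuming the same conditions as in the previous cases for the reliability calculation,

$$R_{Four\text{-}mod} = B(4:4) + B(3:4) + B(2:4) \tag{28}$$

where B(k:4) is the probability that exactly k of the four modules are functioning. Expanding the binomial terms gives

$$R_{Four\text{-}mod} = R_m^4 + 4R_m^3(1 - R_m) + 6R_m^2(1 - R_m)^2 = 6R_m^2 - 8R_m^3 + 3R_m^4$$

$$\mathrm{MTTF}_{Four\text{-}mod} = \frac{1}{\lambda}\left(\frac{6}{2} - \frac{8}{3} + \frac{3}{4}\right) = \frac{13}{12\lambda} \tag{30}$$

The MTTF is thus 0.25/λ and 0.30/λ longer than that of TMR (5/(6λ)) and FMR (47/(60λ)), a 25% and 30% improvement, respectively, when expressed relative to the simplex MTTF of 1/λ.

The contributions of the developed methods are as follows: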
• The authors proposed a highly reliable redundancy technique called the modified triplex-duplex redundancy, which has a 61% and 66% longer expected life than the TMR and FMR techniques, respectively, although its hardware utilization is higher than that of both methods.
• To rectify the hardware consumption drawback of the modified triplex-duplex technique, the authors proposed a novel four-module redundancy technique derived from the modified triplex-duplex method, with the following advantages:
o It is comparable in reliability to the four-module self-purging redundancy with a threshold of 2 and to TMR with one spare, with the additional advantages of tolerating the simultaneous failure of two modules and reducing complexity, which both of those techniques lack.
o It gives a 30% higher MTTF than FMR while using fewer hardware resources.
o It gives a 25% higher MTTF than the TMR method.
o Unlike self-purging redundancy, which requires a specialized threshold voter, the proposed method can be used with both single and triplicated majority voter architectures, since it is based on the modified triplex-duplex architecture.
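The MTTF figures quoted in the contributions can be reproduced with exact arithmetic. The sketch below treats each scheme as a k-out-of-n system and reports gains as differences in units of the simplex MTTF 1/λ; this reading of the percentages is an inference from the numbers rather than something the text states, and the (k, n) pair assigned to each scheme is likewise an assumption.

```python
from fractions import Fraction

# (minimum working modules, physical modules) per scheme; assumed k-out-of-n view
SCHEMES = {
    "TMR": (2, 3),
    "FMR": (3, 5),
    "four-module": (2, 4),
    "modified triplex-duplex": (2, 6),
}


def mttf_in_simplex_units(k_min, n):
    """MTTF divided by the simplex MTTF (1/lambda)."""
    return sum(Fraction(1, i) for i in range(k_min, n + 1))


mttf = {name: mttf_in_simplex_units(k, n) for name, (k, n) in SCHEMES.items()}


def gain(scheme, baseline):
    """MTTF difference between two schemes, in units of 1/lambda."""
    return mttf[scheme] - mttf[baseline]


for name, (k, n) in SCHEMES.items():
    print(f"{name:24s} modules={n}  MTTF={float(mttf[name]):.4f}/lambda")
print("four-module vs TMR:", float(gain("four-module", "TMR")))
print("four-module vs FMR:", float(gain("four-module", "FMR")))
```

Under these assumptions the four-module gains come out as exactly 1/4 and 3/10, while the modified triplex-duplex gains are 37/60 and 2/3.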
Synchronous Buck Converter Controller Design
Closed-loop Control System
Figure 3 shows a synchronous buck converter with its digital control feedback. It consists of four functional blocks: an ADC (analog-to-digital conversion), a compensator (error compensation), a DPWM (digital pulse-width modulation), and a synchronous buck converter power stage. In this circuit, the goal is to minimize the difference between Vref and Vo. Therefore, a digital PID compensator has to be designed to track the error and make it as small as possible.
Digital PID Compensator Design
For control purposes, the block diagram of the buck converter used in this work is shown in Figure 4. The main blocks are the duty cycle-to-output transfer function of the power stage or plant (Gvd), the compensator (H), the total time delay of the control loop, the DPWM gain (Kdpwm), the ADC gain (Kadc) and the output voltage sensor gain (Ksensor).
For a buck converter, the small-signal control-to-output transfer function Gvd(s), Equation (32), is given in [10].
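The design parameters considered are shown in Table 1. The plant transfer function G_p(s), including the effects of the ADC, DPWM and sensor, is given by:

$$G_p(s) = K_{sensor}\,K_{adc}\,K_{dpwm}\,G_{vd}(s)\,e^{-s\,(t_{adc} + dT_s + t_{dpwm})} \tag{34}$$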
where t_adc is the ADC conversion time and t_dpwm is the DPWM delay time.
In Equation (34), the exponent term represents the total time delay, which is usually taken equal to the switching period, that is, T_s = t_adc + dT_s + t_dpwm. Then, the plant transfer function becomes:

$$G_p(s) = K_{sensor}\,K_{adc}\,K_{dpwm}\,G_{vd}(s)\,e^{-sT_s} \tag{35}$$

The transfer function in Equation (35) is used in the MATLAB Control System Toolbox to design the compensator in the analog domain. The designed compensator has a gain margin of 12.9 dB and a phase margin of 66.7 degrees. Note that the phase margin is intentionally made higher to compensate for the phase margin lost when converting to digital form. The compensator so designed is then converted to its equivalent digital form using the bilinear transformation, which yields the final digital PID compensator transfer function.
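As an illustration of the bilinear conversion step only, and not of the compensator actually designed in this work, the Python sketch below discretizes a textbook parallel PID, C(s) = Kp + Ki/s + Kd·s, by substituting s = (2/Ts)(z − 1)/(z + 1). The gains are hypothetical placeholders, and taking the sampling period as one 1.5 MHz switching period is an assumption.

```python
def pid_tustin(kp, ki, kd, ts):
    """Bilinear (Tustin) discretisation of C(s) = kp + ki/s + kd*s.

    Returns (num, den), the coefficients of
    (b0*z^2 + b1*z + b2) / (z^2 - 1) in descending powers of z.
    """
    b0 = kp + ki * ts / 2 + 2 * kd / ts
    b1 = ki * ts - 4 * kd / ts
    b2 = -kp + ki * ts / 2 + 2 * kd / ts
    return (b0, b1, b2), (1.0, 0.0, -1.0)


def pid_step(num, e, e1, e2, u2):
    """Difference equation: u[n] = u[n-2] + b0*e[n] + b1*e[n-1] + b2*e[n-2]."""
    b0, b1, b2 = num
    return u2 + b0 * e + b1 * e1 + b2 * e2


KP, KI, KD = 1.0, 1.0e4, 1.0e-6   # hypothetical gains, not from the paper
TS = 1 / 1.5e6                    # assumed: one sample per switching period

num, den = pid_tustin(KP, KI, KD, TS)
print("numerator:", num)
print("denominator:", den)
```

A pure derivative term mapped with the bilinear transformation places a pole at z = −1, so practical designs usually add a derivative filter; the sketch omits it to keep the algebra visible.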
FPGA Implementation and Results Obtained
The digital PID compensator, an 8-bit sigma-delta ADC and an 8-bit 1.5 MHz DPWM, as well as all the redundancy techniques, have been implemented in MATLAB and Xilinx System Generator. The overall objective is to regulate the output voltage to the desired value irrespective of input voltage and load variations within the given ranges, and irrespective of radiation induced failure of the duplicated modules, up to the number that the redundancy technique in use is able to mask.
Hardware-in-the-Loop Simulation
Hardware-in-the-loop (HIL) simulation is a practical and efficient method for testing the embedded controller. By thoroughly testing the controller in a virtual environment before proceeding to real-world tests of the complete system, one can meet reliability and time requirements in a cost-effective manner. HIL simulation also makes it possible to verify whether the vendor-specific FPGA synthesis tool actually retains the module-level design, which is often not the case. Therefore, an HIL block is generated that represents the radiation tolerant digital voltage mode controller for the synchronous buck converter.
The manual switches (S1, S2, S3, and S4), shown at the input of the controller HIL block diagram in Figure 5, are used to emulate radiation faults during simulation. This is accomplished by switching the controller inputs to signals other than the expected signals from the feedback system, or by switching the inputs to ground (zero). The error detectors of the duplicated voters [11] (PIDErr1, PIDErr2, and PIDErr3) and the error detectors of the DPWM signal voters (PWMErr1 and PWMErr2), shown at the output of the controller HIL block diagram in Figure 5, can be used to initiate a repair or reconfiguration process [12] [13] [14] when radiation faults occur in the respective voters, if such systems are used.
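The fault emulation by input switching can be mimicked in software. The following Python sketch is a deliberately simplified stand-in: each of four modules is reduced to an error computation, a switch replaces a module's input with a wrong signal (0 representing ground), and the output is accepted only if at least two modules agree. The constants, the function names and the agreement rule are assumptions for illustration and do not reproduce the disagreement detectors of the actual design.

```python
from collections import Counter

VREF_CODE = 128                    # hypothetical 8-bit reference code
FAULT_SIGNALS = (0, 37, 201, 90)   # hypothetical wrong inputs; 0 models ground


def module(sample_code):
    """Stand-in for one controller module: just the error term."""
    return VREF_CODE - sample_code


def controller_output(sample_code, switches):
    """Return the agreed module output, or None if no two modules agree.

    switches[i] is True when switch S(i+1) diverts module i's input to
    its fault signal instead of the feedback sample.
    """
    inputs = [FAULT_SIGNALS[i] if diverted else sample_code
              for i, diverted in enumerate(switches)]
    outputs = [module(x) for x in inputs]
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


print(controller_output(100, (False, False, False, False)))
print(controller_output(100, (True, True, False, False)))
print(controller_output(100, (True, True, True, False)))
```

Distinct fault signals are used so that two faulty modules never agree with each other by accident; identical wrong outputs from two faulty modules would be indistinguishable from a healthy pair in this simplified rule.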
Comparison of FPGA Resource Utilization and Reliability
As can be seen from Table 2, the proposed four-module redundancy uses fewer hardware resources than the FMR and modified triplex-duplex redundancies, while having a higher reliability than the TMR and FMR techniques, as explained earlier.
Conclusions
This paper presents a module-level design approach to an FPGA-based radiation tolerant digital voltage mode controller for a synchronous buck converter. A four-module high-reliability redundancy technique is proposed and implemented on the Zynq-7000 development board (ZYBO). The technique has been compared with three other, more commonly used redundancy techniques in terms of reliability and FPGA resource utilization. It is observed that the developed method has a 25% and 30% longer expected life than the TMR and FMR techniques, respectively, and requires fewer FPGA resources than the FMR and modified triplex-duplex techniques.
It is shown that the proposed method can be used for radiation tolerant synchronous buck converter design in applications requiring a longer mission time than the TMR and FMR techniques can provide. The work can be utilized in applications where the fault-masking ability of a system is required, for example space applications, power electronic converters, computers, satellites and high-energy physics experiments.
Conflicts of Interest:
The authors declare no conflict of interest.
