Abstract-Effects of radiation on electronic circuits used in extra-terrestrial applications and radiation prone environments need to be corrected. Since FPGAs offer flexibility, the effects of radiation on them need to be studied and robust methods of fault tolerance need to be devised. In this paper a new fault-tolerant design strategy has been presented. This strategy exploits the relation between changes in inputs and the expected change in output. Essentially, it predicts whether or not a change in the output is expected and thereby calculates the error. As a result this strategy reduces hardware and time redundancy required by existing strategies like Duplication with Comparison (DWC) and Triple Modular Redundancy (TMR). The design arising from this strategy has been simulated and its robustness to fault-injection has been verified. Simulations for a 16 bit multiplier show that the new design strategy performs better than the state-of-the-art on critical factors such as hardware redundancy, time redundancy and power consumption.
INTRODUCTION
THE ever-increasing desire of man to explore the extra-terrestrial space around him, to study and use it for mankind's better future, has given rise to remarkable developments on the space technology front in the last five decades. The technical think tanks behind these developments have often had to face extraordinary challenges. One of these challenges is the vulnerability of electronic circuits used in space systems to powerful space radiation [1]. Circuits used in space applications need to be protected from the effects of such radiation. Closer to home, radiation-resistant circuitry is required for nuclear reactors and other radiation-prone environments [25].
Full-custom hardware design (also known as Application Specific Integrated Circuits, ASICs) and semi-custom hardware design (also known as Field Programmable Gate Arrays, FPGAs) are the two types of hardware that can be used in radiation-prone environments. Though ASICs offer the best performance for space applications, the complexity and cost involved are very high. Also, the functionality of an ASIC is fixed and cannot be altered. ASICs are very robust against SEUs (Single Event Upsets) and they dominate the space technology market [4], [21]. On the other hand, FPGAs are reconfigurable devices. The most commonly used FPGAs are SRAM-based [2]. They offer the flexibility of changing the functionality and, most importantly, the functionality can be modified in the field during a mission too [3], [21]. However, FPGAs are weak against SEUs [5], [7], [10], [11]. In the last decade, studies have been done to correct such errors, which occur due to radiation on SRAM-based FPGAs [5], [7], [8], [9], [12]. These studies show that FPGAs have the potential to outperform ASICs in the space technology market. It therefore becomes necessary to study the effects of radiation on FPGAs and devise better fault-tolerant techniques. The known correction techniques depend on replication (dual, triple) to provide protection [18], [19], [20], [22] (and the references therein). In this paper we present a technique that provides protection based on the logic between the inputs and output(s) of a gate, hence the name "Input-Output Logic Based (IOLB)" technique.
The remainder of the paper is organized as follows. Section 2 reviews radiation effects and existing techniques of fault-tolerance for SRAM-based FPGAs. Section 3 presents the proposed approach to fault-tolerant design. Section 4 compares the proposed approach against the existing techniques. Section 5 concludes the paper.
REVIEW OF RADIATION EFFECTS ON SRAM-BASED FPGAS AND CURRENT TECHNIQUES
In this section we review the effects of radiation on SRAM-based FPGAs and the current techniques used to ameliorate these effects.

arXiv:1311.0602v2 [cs.AR] 5 Nov 2013

Fig. 1. SEUs: An Illustration
Radiation Effects on SRAM-based FPGAs
Due to the effects of the earth's magnetic field and solar cycles, radiation doses change from location to location and time to time. So, a particular technology may be suitably adopted for a particular location and at a particular time [14] , [15] , [16] . Space radiation effects are due to High Energy Electromagnetic Radiation and Particle radiations. This implies that Electromagnetic waves of high frequency viz., X-rays, Gamma rays and fast moving subatomic particles viz., neutrons, protons will affect SRAM-based FPGAs [5] , [6] , [13] .
Broadly, the space radiation effects can be classified into Total Ionizing Dose (TID) and Single Event Upset (SEU). TID is the long-term damage done to the electronic circuitry by electrons and protons. It is a permanent effect and it causes defects in the semiconductor lattice. Its effects are functional failures, leakage current and threshold shifts in CMOS [17]. An SEU, or soft error, is a transient fault which may cause glitches, i.e., current pulses, to propagate through the circuits. These glitches can be mitigated or corrected by design strategies. As shown in Fig. 1, in an SEU, ionizing radiation that strikes the silicon in an electronic device loses its energy by producing free electron-hole pairs. Also, protons, neutrons and gamma rays can give rise to nuclear reactions when passing through a material. The resulting ionization generates large amounts of charge, which is observed as a transient current pulse. Ohlsson et al. studied and analyzed the effect of neutrons in a Xilinx FPGA [11]. In [22], Kastensmidt et al. point out that FPGAs are becoming more susceptible to neutrons as transistor size decreases and logic density increases.
In SRAM-based FPGAs, customizable memory cells (SRAM cells) implement both the user's combinational and sequential logic. When an SEU occurs in the combinational logic (synthesized in the FPGA), it corresponds to a bit flip in one of the LUT cells or in the cells that control the routing. An SEU in a LUT memory cell modifies the implemented combinational logic, while an upset in the routing can connect or disconnect a wire in the matrix. The configuration bitstream's next load corrects both these faults.
An SEU in the user's sequential logic synthesized in the FPGA has a transient effect because the next load corrects it. An SEU in the embedded block RAM has a permanent effect and fault tolerance techniques must correct it.
A fault-tolerant system for SRAM-based FPGAs must cope with the transient and permanent effects of an SEU in the combinational logic, short and open circuits in the design connections, and bit flips in the flip-flops and memory cells [22].
Radiation tests on Xilinx FPGAs for aerospace applications have proven the need to use fault-tolerance schemes for circuits [23]. In [22], it is argued that protecting FPGAs by the use of redundancy is considerably more cost-effective than designing a new FPGA matrix of fault-tolerant elements. The authors go on to propose a fault-tolerant technique that employs time and hardware redundancy.
In [18], Wakerly applied Triple Modular Redundancy (TMR) concepts to improve microcomputer reliability. In [19], Carmichael applied the TMR methodology to the Virtex FPGA series. For high-level SEU mitigation, the technique used most often today to protect designs synthesized in the Virtex architecture is based mainly on TMR combined with scrubbing [22]. In [20], Kastensmidt et al. proposed two new schemes that reduced the redundancy from three in the TMR design to two. In this paper, we present the Input-Output Logic Based (IOLB) method, which eliminates the need for even the existing dual redundancy.
Triple Modular Redundancy
In the TMR method, each pin, wire and block is triplicated and majority voting is done to determine the correct output. The basic idea is depicted in Fig. 2. An illustration of the majority voter circuit is presented in Fig. 3. The area overhead of the TMR technique is more than 3 times that of the standard circuit. TMR does not correct all the upsets, and the upsets will accumulate if there is no extra logic for refreshing. So typically, scrubbing is done (scrubbing is the process of reprogramming the FPGA periodically to ensure that faults do not accumulate) [18], [19]. Also, there are a variety of ways in which TMR can be applied to a circuit. In [20], a comparison of the performance of these various ways is presented. Note that scrubbing lets a system repair SEUs in the configuration memory without disrupting operations (for correcting the voter logic). The scrubbing cycle time depends on the configuration clock frequency and the readback bitstream size [20].
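The voting step can be sketched in a few lines of Python. This is only an illustration: it assumes the standard two-of-three Boolean expression (ab + bc + ac) for the voter, which may differ from the circuit of Fig. 3, and the names `majority_vote` and `golden` are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-of-three majority of three redundant module outputs.

    Assumption: the voter implements ab + bc + ac on every bit.
    """
    return (a & b) | (b & c) | (a & c)


# Hypothetical fault-free module output (4 bits).
golden = 0b1011

# One replica has a flipped bit: the voter masks the upset.
single_fault = majority_vote(golden, golden ^ 0b0100, golden)

# Two replicas have the same flipped bit: the voter is outvoted.
double_fault = majority_vote(golden ^ 0b0100, golden ^ 0b0100, golden)

print(bin(single_fault), bin(double_fault))
```

The second call shows why upsets must not be allowed to accumulate between scrubbing cycles: once two replicas disagree with the fault-free value in the same bit, the majority is wrong.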
Overall, the TMR technique has limitations such as high area overhead, three times more input and output pins, and a significant increase in power dissipation.
Duplication with Comparison (DWC) and Time Redundancy
In this method, dual hardware and time redundancy concepts are used, as presented in Fig. 4, for arriving at the quartet (Tc0, Hc, Tc1, Hcd). The two redundant blocks used are labeled as combinational logic 0 (cl0) and combinational logic 1 (cl1). If an upset occurs in cl0, then Tc0 and Hcd will be '1' and Tc1 and Hc will be '0'. Similarly, an upset in cl1 can be detected when Tc1 and Hcd are '1' and Tc0 and Hc are '0'. Using this information, a state machine is designed to vote for the fault-free block. But when Tc0 is 0, Tc1 is 1, Hc is 0 and Hcd is 1, there is no way to predict the faulty block. For this reason, DWC with time redundancy may fail to correct stuck-at-zero and stuck-at-one faults, as pointed out in [20]. This led to the advent of DWC-CED, which modifies the time redundancy technique used in DWC with time redundancy to detect the permanent effect of an SEU.
DWC with Concurrent Error Detection
This method employs DWC along with Concurrent Error Detection (CED) to detect the location of the error. It uses encode and decode functions (as illustrated in Fig. 5 [20]) to re-compute the input operands. These functions are chosen in such a way that the output due to the re-computed operands differs from that of the original operands in the presence of an error. The voting circuit is presented in Fig. 6. Say there is an error in block 0. Then, Tc0 will be 1 and Hc will also be 1. From the state diagram of the voter circuitry presented in [22], it can be seen that the voter first enters the "Upset Detection" state, as Hc is 1. From there, it enters the state "dr1 is fault-free", since Tc0 is 1. It is worth noting that, for faithful functioning, there must not be upsets in more than one redundant module, including the detection and voting circuits.
In both TMR and DWC-CED, scrubbing corrects upsets in the user's combinational logic, and the CLB flip-flops TMR scheme corrects upsets in the user's sequential logic. Scrubbing must be continuous to guarantee that at most one upset has occurred in the design between two reconfigurations. As such, the scrubbing rate should be fast enough to avoid the accumulation of upsets in two different redundant blocks. Upsets in the detection and voting circuit do not interfere with the system's proper execution because the logic is duplicated and the logic's latches are refreshed every clock cycle [22].
THE PROPOSED TECHNIQUE
The motivation of the proposed Input-Output Logic Based (IOLB) method is to reduce the dual hardware redundancy and time redundancy of the state-of-the-art DWC-CED scheme. The new strategy exploits the relation between changes in inputs and expected changes in output. Such an approach to predicting an expected output is novel and has not been reported hitherto. Our strategy does not require explicit duplication or triplication of the combinational logic block to detect and correct an error.

Fig. 6. Voting the correct block [20]
The proposed design methodology assesses if the output is expected to change following a change in the input(s). These signals (changes in inputs and change in output) are used along with appropriately designed logic to generate the error signal, which can then be XOR-ed with the output signal to yield the error-free output. Like in TMR and DWC-CED, scrubbing needs to be employed to correct upsets in combinational logic and TMR needs to be employed to correct upsets in sequential logic.
NOT Gate
We shall now consider the case of a NOT gate to illustrate the mechanism involved in designing the IOLB correction circuit. Say A is its input and B is its output. In an error-free scenario, if A is '1', then B would be '0'. However, say A is now changed to '0' (due to an SEU) and there is no accompanying change in B (that is, it stays at '0'). This indicates an error, since, in a NOT gate, a change in the input is expected to produce a change in the output too.
Let us now examine a couple of other cases of input-output pair transitions. If the pair (format: 'AB') changes from '01' to '00', this means that the output change has occurred without there being a change in the input. This is again unexpected behavior for a normal NOT gate. Similar would have been the case, if the pair transitioned from '10' to '11'.
The above relations form the central idea behind the design of the error-correcting circuit. Suppose $A_c$ is the change in the input, $B_c$ is the change in the output and E is the error signal. We can then arrive at Table 1. In the first case, if there is no change in the input and the output (syndrome '00'), there is no error. If there is no change in the input but there is a change in the output (syndrome '01'), there is an error. Similarly, the rest of the syndromes can be traced. To compute the change in a variable, we take the XOR of the variable with a delayed version of itself, which gives '1' if there has been a change. The value of the delay is arbitrary. While computing $B_c$ from B, we need to account for the possibility of an error having occurred in the NOT gate. If we just XOR the delayed and current values of B, we will get an erroneous $B_c$ if the NOT gate is error-affected (since B would be erroneous). Hence, if the error signal E is '1' (indicating that the NOT gate is affected), we take the XOR with the NOT of the delayed B to compute $B_c$. A block diagram of the IOLB NOT gate is illustrated in Fig. 7 and the IOLB circuit for the NOT gate is shown in Fig. 8.
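The syndrome logic for the NOT gate can be sketched in Python as follows. This is a simplified illustration rather than the circuit of Fig. 8. It assumes that the error signal equals the XOR of the input change and the output change, which is consistent with the syndromes described above. It also assumes that the previous sample was fault-free, so the multiplexer feedback used when computing the output change is not modeled. The function name `iolb_not_sample` is hypothetical.

```python
def iolb_not_sample(a_prev: int, a_now: int, b_prev: int, b_now: int):
    """Return (error, corrected_output) for one transition of a NOT gate.

    Assumptions: (a_prev, b_prev) is a fault-free sample, and the error
    signal is the XOR of the input change and the output change.
    """
    a_c = a_prev ^ a_now      # change in the input
    b_c = b_prev ^ b_now      # change in the output
    error = a_c ^ b_c         # syndromes '01' and '10' flag an error
    corrected = b_now ^ error  # XOR the output with the error signal
    return error, corrected


# Input changes from 1 to 0 but the output stays at 0.
missed_change = iolb_not_sample(1, 0, 0, 0)

# 'AB' goes from '01' to '00': the output changes without an input change.
spurious_change = iolb_not_sample(0, 0, 1, 0)

# Fault-free transition: both input and output change.
normal = iolb_not_sample(1, 0, 0, 1)

print(missed_change, spurious_change, normal)
```

In both faulty transitions the corrected output equals the NOT of the current input, while the fault-free transition leaves the output untouched.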
Exclusive OR (XOR) Gate
We now consider the case of a two-input XOR gate. Say A and B are its inputs and S is its output. In an error-free scenario, if A and B are the same, then S would be '0', and if they are different, then S would be '1'. However, if A and B are the same and S is '1', or if A and B are different and S is '0', there is an error. Such a case will arise when an expected output change does not follow a change in an input, or when an unexpected output change occurs without a change in the inputs. We shall now look at specific instances of the above cases. Say the current state (format: 'ABS') is '000'. Suppose a transition occurs from '000' to '100'. This is unexpected because, in an XOR gate, the output is expected to change following a change in exactly one of the inputs. Suppose a transition occurs from '000' to '001'. This is again unexpected since, in an XOR gate, the output cannot change without a change in its inputs.
Henceforth, we shall use c-subscripted symbols to denote changes in variables. For instance, $X_c$ denotes the change in the variable X. In other words, $X_c$ = '1' means there has been a change in X and $X_c$ = '0' means there has not been a change in X.
In Table 2, in the first case, there is no change in either of the inputs and there is no change in the output. Hence, the error (denoted as E) is zero. In the second case, there is no change in either of the inputs, but there is a change in the output. As has been discussed above, this is unexpected and hence there is an error (E = '1'). In the third case, exactly one of the inputs has changed and hence the output is expected to change. But it has not changed (since $S_c$ = '0') and hence there is an error (E = '1'). The rest of the cases can be traced from the truth table in a similar fashion.
From Table 2, it can be inferred that $E = A_c \oplus B_c \oplus S_c$.
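The relation inferred from Table 2 can be checked exhaustively with a short Python sketch. The check below assumes that the state before the transition is fault-free (S equals A XOR B) and compares the syndrome-based error signal against a direct comparison of the new output with the XOR of the new inputs. The names `xor_error` and `mismatches` are hypothetical.

```python
from itertools import product


def xor_error(a_c: int, b_c: int, s_c: int) -> int:
    """Error signal for a two-input XOR gate, computed from the changes."""
    return a_c ^ b_c ^ s_c


mismatches = []
# Enumerate every fault-free starting state and every possible transition.
for a, b, a_c, b_c, s_c in product((0, 1), repeat=5):
    s_prev = a ^ b                    # assumed fault-free starting output
    a_now, b_now = a ^ a_c, b ^ b_c   # inputs after the transition
    s_now = s_prev ^ s_c              # observed output after the transition
    faulty = s_now ^ (a_now ^ b_now)  # 1 if the output disagrees with XOR
    if xor_error(a_c, b_c, s_c) != faulty:
        mismatches.append((a, b, a_c, b_c, s_c))

print("mismatches:", mismatches)
```

An empty list of mismatches indicates that, under the stated assumption, the error signal depends only on the changes and not on the state in which the transition began.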
To arrive at the changes in variables, we take the XOR of a variable and its delayed version. However, as in the case of the NOT gate, the computation of $S_c$ is not straightforward. We resolve the problem in the same way as for the NOT gate: in the computation of $S_c$, we use the NOT of the delayed S if E = '1'.
To produce the fault-free output, the error signal E is XOR-ed with S. A complete picture is illustrated

One may think that in IOLB, a single XOR gate requires an additional 5 XOR gates and 1 multiplexer. However, it should be noted that TMR incurs a six-fold increase in resources (refer Fig. 11) for a single XOR gate. TMR also requires triplication of the inputs. For the DWC-CED method too, the resources required would be 5 XOR gates, 4 encode blocks, 2 decode blocks, 4 multiplexers, 2 flip-flops and additional voter circuitry (refer Fig. 12).
General Procedure
Fig. 11. Error correction: TMR for a XOR gate

Fig. 12. DWC-CED for a XOR gate [20]

Consider any logic gate with $X_1, X_2, X_3, \ldots, X_n$ as inputs and Y as the output. As earlier, let $A_c$ denote the change in a variable A. If $A_c$ is 1, there has been a change in A, and if $A_c$ is 0, there has been no change in A. We first obtain the changes in the input variables ($X_{1,c}, X_{2,c}, X_{3,c}, \ldots, X_{n,c}$), labeled as change variables, by XOR-ing $X_1, X_2, X_3, \ldots, X_n$ with their delayed versions. Likewise, we also obtain $Y_c$ from Y. Then, we analyze all possible cases of input-output transitions. There would be $2^{2n+2}$ cases, because given an input state, there are $2^n$ possible transitions (since there are n change variables) and the number of input states is itself $2^n$ (since there are n input variables). Also, the output Y and its change variable $Y_c$ contribute $2^2$ states. The next step involves generating the error variable E, which indicates whether or not there is an error. We arrive at a truth table using $X_1, X_2, X_3, \ldots, X_n, X_{1,c}, X_{2,c}, X_{3,c}, \ldots, X_{n,c}, Y, Y_c$ (labeled as error inputs) and estimate the error variable E in each of the $2^{2n+2}$ cases. After the error variable E's column in the truth table is completed, E is expressed as a function of the error inputs. Sometimes, as witnessed in the above cases of NOT and XOR, the state where the transition began does not matter in generating E; only the transitions themselves matter. So, we look for such potential redundancies and formulate an expression for the error variable E. The general expression can be presented as

$E = f(X_1, X_2, \ldots, X_n, X_{1,c}, X_{2,c}, \ldots, X_{n,c}, Y, Y_c)$

After the error variable E is obtained, the output Y is XOR-ed with E to generate the error-free output F.
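The enumeration described above can be sketched in Python for an arbitrary gate. This is a minimal sketch under stated assumptions, not the authors' tool flow: the error inputs are taken to be the state before the transition together with the change variables, and E is assumed to be '1' whenever the output after the transition disagrees with the gate function applied to the inputs after the transition. The names `error_truth_table`, `and_gate` and `xor_gate` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# One truth-table row: (inputs before, input changes, output before, output change)
Row = Tuple[Tuple[int, ...], Tuple[int, ...], int, int]


def error_truth_table(gate: Callable[..., int], n: int) -> Dict[Row, int]:
    """Enumerate all 2**(2n+2) cases and assign the error variable E.

    Assumption: E = 1 when the output after the transition differs from
    gate(inputs after the transition).
    """
    table: Dict[Row, int] = {}
    for bits in product((0, 1), repeat=2 * n + 2):
        x_prev = bits[:n]
        x_c = bits[n:2 * n]
        y_prev, y_c = bits[2 * n], bits[2 * n + 1]
        x_now = tuple(x ^ c for x, c in zip(x_prev, x_c))
        y_now = y_prev ^ y_c
        table[(x_prev, x_c, y_prev, y_c)] = y_now ^ gate(*x_now)
    return table


def and_gate(a: int, b: int) -> int:
    return a & b


def xor_gate(a: int, b: int) -> int:
    return a ^ b


and_table = error_truth_table(and_gate, 2)
xor_table = error_truth_table(xor_gate, 2)
print(len(and_table), "cases for a two-input gate")
```

For the XOR gate, the rows that start from a fault-free state reproduce the expression in terms of the change variables alone, whereas for the AND gate the value of E also depends on the input state, so fewer redundancies can be removed.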
COMPARISON
A 16-bit multiplier has been implemented using the IOLB strategy presented above. The IOLB circuits for AND and OR gates have also been arrived at using the above strategy. Each of the logic gates in the cascaded multiplier was replaced with its full IOLB counterpart. We evaluated our method by testing it against fault-injection, and the percentage of corrected faults was found to be 100%. Just like in [22], we utilized 4x1 fault-injection multiplexers for emulating stuck-at-zero and stuck-at-one faults. A comparison is presented in Table 3. As can be seen from Table 3, the proposed design strategy requires fewer hardware resources than DWC-CED and TMR. Delay and power consumption could not be compared, as we could not gather the clock frequency that was used in the study presented in [22]. However, we argue that our technique will perform better on delay, because there is no explicit time redundancy as in DWC-CED. Also, IOLB will do better on power consumption because it requires fewer hardware resources and less time.
In the event of an SEU, all the three schemes are error free. However, in TMR there is a possibility that error can occur in two or more replications. Similarly, in DWC-CED, there is a possibility that error occurs in both the replications while in our scheme error can occur at most in one module. While the scheme of this paper will correct possible errors, DWC and TMR will fail for two or more errors. This is elaborated in the following analysis.
In Tables 4, 5 and 6, $M_i$ indicates the i-th module. If $M_i$ is 0, then the module is fault-free, and if $M_i$ is 1, there is a fault in it. In TMR, a module is triplicated and faithful functioning is expected if at least two of the three instances are fully free from error. In Table 4, we have identified eight possible scenarios for TMR, each of which could occur with equal probability. As can be seen from Table 4, TMR works in 4 out of the 8 possible cases. This is because at least two of the three instances of the module should be fault-free for faithful functioning. From Table 5, it can be inferred that DWC-CED works in 3 out of the 4 possible cases. This is because DWC-CED works in all cases except when both the instances of the module (i.e., both $M_1$ and $M_2$) are affected.
We present a similar analysis in Table 6 for IOLB technique. The IOLB technique uses only one instance of the original module. If there is a fault in the module, then the IOLB circuit corrects it. So, in both cases, a faithful output is obtained. We thus conclude from our above analysis that IOLB is the least likely to be affected when compared with TMR and DWC-CED. A summary of the findings from the above analysis is presented in Table 7 .
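The case counts in the above analysis can be reproduced with a small enumeration. The Python sketch below assumes, as in the analysis, that every fault scenario is equally likely and that each scheme tolerates at most one faulty module. The function name `count_working` is hypothetical.

```python
from itertools import product
from typing import Tuple


def count_working(num_modules: int, tolerated_faults: int) -> Tuple[int, int]:
    """Return (working scenarios, total scenarios) for a scheme.

    Each scenario assigns 0 (fault-free) or 1 (faulty) to every module.
    """
    scenarios = list(product((0, 1), repeat=num_modules))
    working = sum(1 for s in scenarios if sum(s) <= tolerated_faults)
    return working, len(scenarios)


tmr = count_working(num_modules=3, tolerated_faults=1)      # three replicas
dwc_ced = count_working(num_modules=2, tolerated_faults=1)  # two replicas
iolb = count_working(num_modules=1, tolerated_faults=1)     # single module

for name, (ok, total) in (("TMR", tmr), ("DWC-CED", dwc_ced), ("IOLB", iolb)):
    print(f"{name}: works in {ok} of {total} scenarios")
```

The counts obtained are 4 of 8 for TMR, 3 of 4 for DWC-CED and 2 of 2 for IOLB, in line with the analysis above.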
CONCLUSION & FUTURE DIRECTIONS
In this paper, we have presented a new strategy for fault-tolerant design that performs better than the state-of-the-art. The proposed strategy essentially provides a mechanism to predict whether or not a change in the output is expected, as a function of changes in inputs. For this reason, the utility of this circuit is not only confined to fault-tolerant design but also relevant to applications that require information about whether or not changes in inputs would result in changes in output(s). Directions for future work include looking at other application domains where the proposed strategy can be applied to good effect.
