Abstract
Introduction
As Very-Large-Scale Integration (VLSI) devices become more susceptible to soft errors, owing to technology scaling and to external disturbances such as charged radiation particles, it is important to integrate fault-tolerance mechanisms into the system design to improve system reliability, especially for safety- and mission-critical applications. From a practical point of view, the need for effective techniques able to detect the possible occurrence of soft errors during the operational phase is becoming a major issue.
On one hand, processor-based systems are commonly used in a wide variety of applications, including safety-critical and high-availability missions, e.g., in the automotive, biomedical and aerospace domains. In these fields, an error may produce catastrophic consequences. Thus, dependability is a primary target that must be achieved while taking into account tight constraints in terms of cost, performance, power and time to market. Several solutions exist, acting either on hardware or on software; however, they all have to face the high effort required for designing, manufacturing, testing and qualifying processor-based systems. Standards and regulations (e.g., ISO-26262 [1], DO-254 [2], IEC-61508 [3]) clearly specify the targets to be achieved and the methods to prove their achievement. In this scenario, techniques working at system level (i.e., without changing the technology and the processor) are particularly attractive, especially if they can meet dependability requirements efficiently without changes to the existing hardware and software. Solutions based on additional modules that monitor the processor behavior and check its evolution, looking for possible fault effects, belong to this category.
On the other hand, Field Programmable Gate Array (FPGA) devices are becoming more and more attractive, also in safety- and mission-critical applications, due to the high performance, low power consumption and reconfiguration flexibility they provide.
Testing for SoC/SoPC by Exploiting the Debug Infrastructures
Soft errors in a processor can be divided into two main, non-exclusive categories: Data Errors and Control Flow Errors (CFEs). For Data Errors, detection and correction approaches already exist, such as Error Correction Coding (ECC) and redundancy-based software techniques. Since a significant percentage of errors in a processor manifest themselves as CFEs, altering the execution flow of the software and potentially leading to system misbehavior, the first focus of this work is the online test of CFEs.
Background
Traditional techniques such as Triple Modular Redundancy (TMR) can be used to detect CFEs. However, such techniques introduce a large resource overhead, which may be unaffordable under system constraints in terms of cost, power consumption, etc.
Among the techniques that can be applied at system level, Control Flow Checking (CFC) is particularly effective. CFC consists in verifying the sequence of instructions executed by the target processor. Several CFC approaches are based on signature monitoring [4]: the program is divided into a set of blocks, named Basic Blocks (BBs), each having only one entry point and only one exit point, so that whenever the entry-point instruction is executed, the following instructions in the block are executed sequentially. Each BB has an associated signature that is calculated at compile time and stored in the system. During the operational phase, a run-time signature is calculated and, at the end of the BB execution, compared with the reference signature, which allows any error affecting the BB execution flow to be detected.
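The signature monitoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature function (CRC-32 over the instruction words), the `PROGRAM` contents and the names `block_signature` and `check_block` are hypothetical and are not taken from [4] or from the proposed CFC module.

```python
from zlib import crc32

def block_signature(instructions):
    """Fold the 32-bit machine words of one basic block into a signature.

    CRC-32 is an assumed signature function, chosen only for illustration.
    """
    sig = 0
    for word in instructions:
        sig = crc32(word.to_bytes(4, "little"), sig)
    return sig

# Hypothetical program: entry address of each basic block -> its machine words.
PROGRAM = {
    0x100: [0x8C010000, 0x20210001, 0x10200002],
    0x10C: [0xAC010000, 0x08000040],
}

# "Compile time": reference signatures are computed once and stored.
REFERENCE = {entry: block_signature(words) for entry, words in PROGRAM.items()}

def check_block(entry, executed_words):
    """Return True when the run-time signature matches the stored reference."""
    return REFERENCE.get(entry) == block_signature(executed_words)

# Fault-free execution of the first block.
ok = check_block(0x100, PROGRAM[0x100])

# Same block with a single bit flipped in the second instruction word.
corrupted = list(PROGRAM[0x100])
corrupted[1] ^= 1 << 5
flipped = check_block(0x100, corrupted)
```

In this sketch the comparison is performed once per block, mirroring the end-of-block check described in the text; a different signature function would only change `block_signature`.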
CFC Module for Online Detection of CFEs
A new CFC method has been proposed in this work, resorting to available debug infrastructures to monitor the processor behavior. Debug infrastructures are intended to support software debugging during embedded system development, and are very common in modern processors. Since they are useless once the operational phase is entered, they can be easily reused for online monitoring in an inexpensive way [5, 6, 7, 8]. Moreover, they provide internal access to the processor without disturbing it and do not require any modification either to the processor or to the software running on it.
The proposed approach introduces an external hardware module, named the CFC module, attached to the target processor via the debug interface to monitor the sequence of instructions executed by the processor, as illustrated in Fig. 1. The CFC module extracts the instructions executed by the processor, computes the signature of each BB and, at the end of the BB execution, compares the online BB signature against the one pre-computed and stored in the Signature Table (CFC-ST). Furthermore, a dynamic version of the CFC module has been proposed to handle the scenario where the number of BBs in the software is significantly larger than the size of the CFC-ST, which is limited by hardware cost constraints; in this version, the CFC-ST can be updated on-the-fly. With a replacement policy similar to cache management, which optimizes the hit rate by taking advantage of the principle of locality [9] and of existing cache-aware compiler optimization techniques [10], the dynamic CFC module is capable of achieving high error coverage even when the CFC-ST is significantly smaller than the number of BBs in the software.
To verify the CFE detection capability of the proposed method, two processors were selected as targets: miniMIPS [11] from OpenCores and LEON3 [12]. The miniMIPS architecture is based on 32-bit registers and addresses, includes a 5-stage pipeline, and accounts for about 45K equivalent gates when synthesized (with multiplier) with the FreePDK45 Generic OpenCell Library from NanGate [13]. Note that the original miniMIPS does not have a debug interface; one similar to LEON3's was added, reporting the address and machine code of each executed instruction. The LEON3 processor was synthesized using the same design flow and library as miniMIPS; the LEON3 model used in the fault injection campaign is about 150K equivalent gates in size.
The CFC module was implemented in VHDL and synthesized using the same cell library. The hardware costs of the dynamic CFC modules for miniMIPS and LEON3 are 800 and 2300 equivalent gates (excluding the cost of the CFC-ST), respectively, corresponding to less than 2% and 1.4% of the total hardware size of the miniMIPS and LEON3 processors (the cost of the static CFC module is even lower).
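A minimal sketch of a dynamically updated signature table is given below. The text only states that the replacement policy is similar to cache management; the least-recently-used (LRU) policy, the `SignatureTable` class, the `backing_store` holding all reference signatures, and the choice to leave a block unchecked on a miss are all assumptions made for illustration.

```python
from collections import OrderedDict

class SignatureTable:
    """Fixed-capacity signature table with LRU replacement (assumed policy)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        # Hypothetical external memory holding every reference signature.
        self.backing = backing_store
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, bb_entry):
        """Return the reference signature on a hit.

        On a miss, return None (the block cannot be checked this time)
        and load the signature so that later executions can be checked.
        """
        if bb_entry in self.entries:
            self.entries.move_to_end(bb_entry)   # mark as most recently used
            self.hits += 1
            return self.entries[bb_entry]
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)     # evict least recently used
        self.entries[bb_entry] = self.backing[bb_entry]
        return None

# Hypothetical trace: a loop over blocks 0 and 1, then blocks 2 and 0.
backing = {bb: bb * 7 for bb in range(8)}
table = SignatureTable(capacity=2, backing_store=backing)
for bb in [0, 1, 0, 1, 0, 1, 2, 0]:
    table.lookup(bb)
```

Because loops revisit the same few blocks, most lookups in such a trace hit even with a table much smaller than the number of BBs, which is the locality argument made in the text.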
Five benchmark programs were used in the fault injection campaign:
1) Bubble: implements the Bubble Sort algorithm on a vector composed of 8 integer elements,
2) Matrix: computes the multiplication between two 3 by 3 integer matrices,
3) Dijkstra: implements the Dijkstra shortest path searching algorithm on a weighted graph with 9 nodes,
4) RLE: implements the Run Length Encoding and Decoding algorithm on a data set composed of 100 integers (of two different values),
5) MF: implements the Ford-Fulkerson algorithm, which computes the maximum flow in a flow network (of 32 nodes connected with at least 64 random edges).
Three CFE-oriented fault models, adopted from [14], were used in the fault injection campaign:
1) Fault Model #1: a randomly chosen branch instruction is changed to a NOP instruction,
2) Fault Model #2: a randomly chosen bit in the Program Counter (PC) value is flipped at a random time,
3) Fault Model #3: a randomly chosen bit in the operand of a branch instruction is flipped at a random time.
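The three fault models can be expressed as small mutation functions, sketched below. The encoding details are assumptions: the all-zero NOP word is the MIPS encoding, the operand is taken to be the low 16 bits of the branch word (as in MIPS I-type branches), and the random injection time is not modeled. The function names are hypothetical and do not come from [14].

```python
import random

NOP = 0x00000000  # all-zero word: the NOP encoding on MIPS (assumed target)

def flip_bit(value, bit, width=32):
    """Flip one bit of a width-bit value."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)

def fault_model_1(program, branch_indices, rng):
    """Fault Model #1: replace a randomly chosen branch with a NOP."""
    faulty = list(program)
    faulty[rng.choice(branch_indices)] = NOP
    return faulty

def fault_model_2(pc, rng):
    """Fault Model #2: flip a randomly chosen bit of the PC value."""
    return flip_bit(pc, rng.randrange(32))

def fault_model_3(branch_word, rng, operand_bits=16):
    """Fault Model #3: flip a randomly chosen bit of the branch operand.

    The operand is assumed to occupy the low `operand_bits` bits.
    """
    return flip_bit(branch_word, rng.randrange(operand_bits))

rng = random.Random(0)                      # fixed seed for repeatability
program = [0x8C010000, 0x10200002, 0xAC010000]
no_branch = fault_model_1(program, branch_indices=[1], rng=rng)
bad_pc = fault_model_2(0x00000100, rng)
bad_branch = fault_model_3(0x10200002, rng)
```

Each function returns the corrupted value rather than modifying a simulator, so the sketch can be plugged into whatever injection mechanism is available.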
The fault injection campaign results are reported in Table II, where it can be observed that the dynamic CFC module is capable of achieving a high error detection capability even when the size of the CFC-ST is significantly smaller than the number of BBs in the software. Note that data for the Bubble and Matrix benchmarks are not reported: their numbers of BBs are already quite small, so a small CFC-ST is sufficient to store all the signatures and the static CFC module is able to detect all the CFEs for these two benchmarks.
Hybrid Nonintrusive Error Detection Technique
In order to achieve full soft error detection capability, including Data Errors, a hybrid solution has been proposed that combines hardware monitoring via the debug interface with software-based techniques for detecting Data Errors.
Two observation points are used: the memory bus between the instruction memory and the processor (instruction input stream) and the debug interface (instruction output stream). Information from the two observation points is carefully synchronized by the Hardware Monitor (HM) and compared to detect CFEs, as illustrated in Fig. 2.
On the software side, two techniques are applied:
1) total data-flow duplication: all the software data is duplicated and compared when a write operation is performed. When a discrepancy between the two data flows is found, an error is detected. To reduce the performance penalty as well as the code size, the number of checking points in the code was reduced as proposed by [15]. A similar approach can be found in [5], where the number of checks is varied depending on system requirements;
2) inverted branches [5]: branch conditions are re-evaluated in two locations. When the branch is taken, the branch instruction is repeated with an inverted condition. Otherwise, the branch instruction is simply repeated. If the repeated evaluation does not produce the same result, an error is detected.
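The two software techniques above can be illustrated with a short Python sketch. It is a high-level model, not the instruction-level transformation of [5] or [15]: the names `checked_store`, `checked_branch` and `duplicated_sum`, and the injected bit flip, are hypothetical.

```python
class SoftErrorDetected(Exception):
    """Raised when the two redundant evaluations disagree."""

def checked_store(memory, address, value, shadow_value):
    """Data-flow duplication: compare both copies before the write."""
    if value != shadow_value:
        raise SoftErrorDetected(f"data mismatch before write to {address:#x}")
    memory[address] = value

def checked_branch(condition):
    """Inverted branches: re-evaluate the condition on both paths."""
    taken = condition()
    if taken:
        # Taken path: the inverted condition must not hold.
        if not condition():
            raise SoftErrorDetected("branch taken but inverted check holds")
    else:
        # Fall-through path: the repeated branch must not be taken.
        if condition():
            raise SoftErrorDetected("branch not taken but repeated check holds")
    return taken

def duplicated_sum(values, memory, address, fault_at=None):
    """Sum `values` in two independent data flows and store the result."""
    total, total_dup = 0, 0
    for i, v in enumerate(values):
        total += v
        total_dup += v
        if i == fault_at:
            total ^= 1 << 3          # injected bit flip in the primary copy
    checked_store(memory, address, total, total_dup)

memory = {}
duplicated_sum([1, 2, 3], memory, 0x40)          # fault-free run
try:
    duplicated_sum([1, 2, 3], {}, 0x40, fault_at=1)
    detected = False
except SoftErrorDetected:
    detected = True
```

Checking only at write operations, as in `checked_store`, reflects the idea of comparing the two data flows only where a corrupted value would leave the processor.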
Fault injection campaign has been carried out using the AMUSE [16, 17] 
Analysis and Mitigation of Single Event Effects on FPGA Devices
Depending on the technology used to manufacture the configuration memory, the FPGA devices commonly available on the market can be divided into SRAM-based FPGAs and Flash-based FPGAs. Where radiation-induced Single Event Effects (SEEs) are concerned, the analysis and mitigation solutions for these two types of FPGA devices differ because of their different characteristics.
Single Event Upset in Configuration Memory of SRAM-based FPGA
The SRAM cells holding the configuration data of the circuit design are among the elements most sensitive to radiation effects. A one-bit corruption in the configuration memory could alter the logic implemented and mapped on the device, causing system failure. So, for SRAM-based FPGAs, the Single Event Upset (SEU) in the configuration memory is the major concern when deployed in harsh environments such as space missions.
Solutions to cope with SEUs in configuration memory already exist. Traditional redundancy-based solutions triplicate the circuit design so that the system can still work even if one copy is corrupted. Solutions based on scrubbing take advantage of the fine control over configuration memory frame access to fix accumulated SEUs by periodically refreshing the configuration memory.
The proposed analysis and mitigation solution is able to: 1) produce an error rate prediction for the target circuit design given the radiation environment profile; 2) generate placement constraints to implement a more reliable design without introducing any hardware overhead. In this way, the proposed solution is easy to integrate with the standard FPGA design flow, as illustrated in Fig. 4. Fig. 5.
The second experiment used a customized benchmark circuit, B13, from the ITC'99 benchmark collection [18], and was carried out at the Los Alamos Neutron Science Center (LANSCE), USA, using a neutron beam with a flux of around 5.58×10^5 p/(cm^2·s) at an energy level of 10 MeV. For both experiments, three different versions were prepared on a Xilinx Virtex 5 SRAM-based FPGA:
1) Plain: the original version of the benchmark circuit,
2) XTMR: hardened version produced by the Xilinx TMR tool [19],
3) XTMR-VP: hardened version based on the XTMR version with the proposed mitigation solution (using the VERIPlace tool).
The error rates from the two experiments, calculated as the probability of an error at the output with respect to a given number of SEUs accumulated in the configuration memory, are shown in Fig. 6. As can be observed from the figures, the version with the proposed mitigation solution, XTMR-VP, has the lowest error rate among the three versions. Note that the proposed solution should be accompanied by other strategies that avoid SEU accumulation in the configuration memory, such as scrubbing, where the error rate prediction of the proposed solution could be used as a basis for scrubbing scheduling.
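The two traditional approaches mentioned above can be sketched behaviorally as follows. This is a simplified model under stated assumptions: configuration frames are represented as plain integers, the scrubber compares against a golden copy (real scrubbers may instead rewrite blindly or rely on frame ECC), and the names `majority_vote` and `scrub` are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of the outputs of three replicas."""
    return (a & b) | (a & c) | (b & c)

def scrub(config_frames, golden_frames):
    """Rewrite every frame that differs from the golden copy.

    Returns the number of repaired frames.
    """
    repaired = 0
    for i, golden in enumerate(golden_frames):
        if config_frames[i] != golden:
            config_frames[i] = golden
            repaired += 1
    return repaired

# One replica output is corrupted; the voter masks the error.
voted = majority_vote(0b1010, 0b1010, 0b0010)

# Two accumulated upsets in a hypothetical 4-frame configuration memory.
golden = [0xDEADBEEF, 0x12345678, 0x0BADF00D, 0xCAFEBABE]
frames = list(golden)
frames[1] ^= 1 << 7
frames[3] ^= 1 << 20
fixed = scrub(frames, golden)
```

The voter only masks an error while at most one replica is wrong, which is why periodic scrubbing is needed to keep upsets from accumulating across replicas.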
Single Event Effects in Flash-based FPGA
Due to its non-volatile configuration memory, the Flash-based FPGA, unlike the SRAM-based FPGA, is almost immune to SEUs in the configuration memory. However, the floating-gate-based switches can still suffer from Single Event Transients (SETs) if hit by highly energetic particles. Depending on the type of resource that is hit, different phenomena can occur: 1) if the charged particle hits a memory element, such as a Flip-Flop (FF), it may directly corrupt the stored data, so that an SEU occurs; 2) if the charged particle hits a combinational logic element or a routing wire, it may induce a transient pulse, i.e., an SET, which will traverse the logic paths and may reach the input of a storage element (or an output pin). In this case, depending on the width of the pulse and the sampling time of the memory element, an SEU or Single Event Multiple Upsets (SEMU) may occur.
Traditional TMR of memory elements is capable of handling the SEU in the first case, but not the SET, as it can propagate to all three replicas of the logic. Full triplication of the design could mitigate the SET, but induces heavy overheads in terms of hardware and power. Meanwhile, previous research [20] found that when an SET pulse traverses the logic gates along the paths in the design, it may be broadened or filtered depending on the type of gate traversed, which is called the Propagation Induced Pulse Broadening (PIPB) effect.
The analysis and mitigation flow for SEEs in Flash-based FPGAs proposed in this work, illustrated in Fig. 7, consists of the following steps:
1) SET Analyzer: takes the post-layout netlist and placement information exported from the standard FPGA design tool, together with the Flash-based Technology Library, which contains the structural and performance information of the target FPGA, and performs the SET analysis, generating the sensitivity to SETs defined by a user-specified profile, along with the SET filtering/broadening effect for each logic gate;
2) Selective Guard-Gate (GG) Mapper: takes the FF and Gate Profile database generated by the previous step, first inserts a TMR structure with a majority voter for each FF marked as SEU sensitive, and then selectively inserts GG structures, as illustrated in Fig. 8, into the design to filter SETs according to a user-defined threshold, in order to tune the trade-off between reliability against SETs and the hardware and performance overhead;
3) SET-aware Place & Route: the developed place and route algorithm considers the SET filtering/broadening effects generated by the SET Analyzer in the first step and carefully locates the logic gates according to rules for maximizing the SET filtering effects; for example, if two adjacent logic gates are inverting gates (gates generating at the output the inverted logic value of the input), then these two logic gates are placed close to each other. The outcome of this step is the placement constraint file, which can be easily imported into the standard FPGA design tool to generate the final SET- and SEU-hardened implementation for the target FPGA.
To verify the effectiveness of the proposed SET mitigation solution, a radiation test experiment has been carried out using a Microsemi ProASIC3E Flash-based FPGA as the Device Under Test (DUT). A RISC processor from OpenCores, namely RISC5X, was selected as the benchmark circuit, using a simple sequence counter as the software application. The original version of RISC5X is illustrated in Fig. 9, including the Instruction Decoder (IDEC), the Arithmetic Logic Unit (ALU), three 8-bit IO ports and a ROM component holding the instructions. The radiation experiment took place at the Université Catholique de Louvain (UCL) using a Krypton ion beam with a fluence of 3.04e8 particles and an average flux of 1e4 (particles/sec). Four versions of the RISC5X have been prepared:
1) Plain: the original version, with the RAM (which is actually a Register File) protected by an ECC able to correct two-bit errors when the two bits reside separately in the higher and lower halves of the 8-bit register. The ROM is implemented as logic gates instead of memory to avoid SEUs corrupting the instructions;
2) TMR+GG(1ns): based on the Plain version, with entity-level TMR applied (to IDEC and ALU) and GGs capable of filtering SETs of up to 1 ns inserted;
3) TMR-FF: TMR applied to the Flip-Flops by the Synplify tool;
4) SEE-Aware Flow: based on Plain version with proposed mitigation flow.
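The filtering behavior of a guard gate, as used in the flow and in the TMR+GG version above, can be illustrated with a discrete-time model. This is a behavioral assumption rather than the structure of Fig. 8: the gate is modeled as updating its output only when the signal and a delayed copy of it agree, holding the previous value otherwise, and the function name and sample-based timing are hypothetical.

```python
def guard_gate(samples, delay):
    """Filter a sampled logic signal with a guard gate.

    The output follows the input only when the input and its copy delayed
    by `delay` samples agree; otherwise the previous output is held.
    """
    out = []
    state = samples[0]
    for t, now in enumerate(samples):
        delayed = samples[t - delay] if t >= delay else samples[0]
        if now == delayed:
            state = now
        out.append(state)
    return out

def pulse(length, start, width):
    """Build a 0-valued signal with a 1-valued pulse of the given width."""
    return [1 if start <= t < start + width else 0 for t in range(length)]

# A transient shorter than the delay is removed.
short = guard_gate(pulse(14, start=5, width=2), delay=3)

# A pulse longer than the delay passes through, shifted by the delay.
long_ = guard_gate(pulse(14, start=5, width=4), delay=3)
```

In this model the delay plays the role of the filtering threshold: a larger delay removes wider transients at the price of added latency on legitimate transitions, which is the trade-off the user-defined threshold tunes.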
The sequence counter generates a sequence of numbers at the output port of RISC5X, which is continuously monitored and recorded by a separate board during the test. In case of a data mismatch or a time gap between output data, an error is marked as detected and the DUT is reset and reprogrammed. With the data collected during the experiment, the error rate cross-section is plotted in Fig. 10. As can be observed, the TMR+GG and TMR-FF versions have similar error rates, both lower than that of the Plain version. However, TMR+GG uses entity-level TMR, meaning more than 200% overhead. The SEE-Aware version has an error rate around 50% lower than that of the TMR versions while introducing much lower overhead, as can be observed in Table III.
Conclusions
This work addresses soft errors in electronic designs, focusing on processor-based systems and FPGA devices. Online error detection solutions have been proposed that exploit the debug interface to increase processor observability; the results show that these solutions can effectively detect soft errors with low hardware overhead. Analysis and mitigation solutions for SEEs in both SRAM-based and Flash-based FPGA devices have also been proposed, and radiation test experiments verified that they can improve design reliability against SEEs without introducing large hardware overhead.

