5 research outputs found

    Fine-grained module-based error recovery in FPGA-based TMR systems

    Full text link
    Space processing applications deployed on SRAM-based Field Programmable Gate Arrays (FPGAs) are vulnerable to radiation-induced Single Event Upsets (SEUs). Compared with the well-known SEU mitigation solution - Triple Modular Redundancy (TMR) with configuration memory scrubbing - TMR with modulebased error recovery (MER) is notably more energy efficient and responsive in repairing soft-errors in the system. Unfortunately, TMR-MER systems also need to resort to scrubbing when errors occur between subcomponents, such as in interconnection nets, which are not recovered by MER. This article addresses this problem by proposing a fine-grained module-based error recovery technique, which can localize and correct errors that classic MER fails to do without additional system hardware. We evaluate our proposal via fault-injection campaigns on three types of circuits implemented in Xilinx 7-Series devices. With respect to scrubbing, we observed reductions in the mean time to repair configuration memory errors of between 48.5% and 89.4%, while reductions in energy used recovering from configuration memory errors were estimated at between 77.4% and 96.1%. These improvements result in higher reliability for systems employing TMR with fine-grained reconfiguration than equivalent systems relying on scrubbing for configuration error recovery

    Fine-grained module-based error recovery in FPGA-based TMR systems

    No full text
    Space processing applications deployed on SRAM-based Field Programmable Gate Arrays (FPGAs) are vulnerable to radiation-induced Single Event Upsets (SEUs). Compared with the well-known SEU mitigation solution — Triple Modular Redundancy (TMR) with configuration memory scrubbing — TMR with module-based error recovery (MER) is notably more energy efficient and responsive in repairing soft-errors in the system. Unfortunately, TMR-MER systems also need to resort to scrubbing when errors occur in sub-components, such as nets, which are not recovered by MER. This paper addresses this problem by proposing a fine-grained module-based error recovery technique that without additional system hardware can localize and correct errors that classic MER fails to do. We evaluate our proposal via a fault-injection campaign on a Xilinx Artix-7 application circuit and compare the reliability, the error correction latency and the energy cost of repairing errors, of our proposal with those of a conventional MER approach and with periodic and on-demand blind scrubbing. We find the reliability of our proposal to be the highest and the energy expenditure to be the lowest amongst those methods considered.8 page(s

    High-level synthesis of triple modular redundant FPGA circuits with energy efficient error recovery mechanisms

    Full text link
    There is a growing interest in deploying commercial SRAM-based Field Programmable Gate Array (FPGA) circuits in space due to their low cost, reconfigurability, high logic capacity and rich I/O interfaces. However, their configuration memory (CM) is vulnerable to ionising radiation which raises the need for effective fault-tolerant design techniques. This thesis provides the following contributions to mitigate the negative effects of soft errors in SRAM FPGA circuits. Triple Modular Redundancy (TMR) with periodic CM scrubbing or Module-based CM error recovery (MER) are popular techniques for mitigating soft errors in FPGA circuits. However, this thesis shows that MER does not recover CM soft errors in logic instantiated outside the reconfigurable regions of TMR modules. To address this limitation, a hybrid error recovery mechanism, namely FMER, is proposed. FMER uses selective periodic scrubbing and MER to recover CM soft errors inside and outside the reconfigurable regions of TMR modules, respectively. Experimental results indicate that TMR circuits with FMER achieve higher dependability with less energy consumption than those using periodic scrubbing or MER alone. An imperative component of MER and FMER is the reconfiguration control network (RCN) that transfers the minority reports of TMR components, i.e., which, if any, TMR module needs recovery, to the FPGA's reconfiguration controller (RC). Although several reliable RCs have been proposed, a study of reliable RCNs has not been previously reported. This thesis fills this research gap, by proposing a technique that transfers the circuit's minority reports to the RC via the configuration-layer of the FPGA. This reduces the resource utilisation of the RCN and therefore its failure rate. Results show that the proposed RCN achieves higher reliability than alternative RCN architectures reported in the literature. The last contribution of this thesis is a high-level synthesis (HLS) tool, namely TLegUp, developed within the LegUp HLS framework. TLegUp triplicates Xilinx 7-series FPGA circuits during HLS rather than during the register-transfer level pre- or post-synthesis flow stage, as existing computer-aided design tools do. Results show that TLegUp can generate non-partitioned TMR circuits with 500x less soft error sensitivity than non-triplicated functional equivalent baseline circuits, while utilising 3-4x more resources and having 11% lower frequency
    corecore