33 research outputs found

    An adaptive method to tolerate soft errors in SRAM-based FPGAs

    Get PDF
    AbstractIn this paper, we present an adaptive method that is a combination of SEU-avoidance in CAD flow and adaptive redundancy to tolerate soft error effects in SRAM-based FPGAs. This method is based on the modification of T-VPack and VPR tools. Three different steps of these tools are modified for SEU-awareness: (1) clustering, (2) placement and (3) routing. Then we use the unused resources as redundancy. We have investigated the effect of this method on several MCNC benchmarks. This investigation has been performed using three experiments: (1) SEU-awareness in clustering with redundancy, (2) SEU-awareness in clustering and placement with redundancy and (3) SEU-awareness in clustering, placement and routing with redundancy. With a confidence level of 95%, the results show that, using each of these three experiments, the system failure rate of ten MCNC circuits has been decreased between 4.52% and 10.42%, between 10.25% and 21.63%, and between 10.48% and 24.39%, respectively

    Using Fine Grain Approaches for highly reliable Design of FPGA-based Systems in Space

    Get PDF
    Nowadays using SRAM based FPGAs in space missions is increasingly considered due to their flexibility and reprogrammability. A challenge is the devices sensitivity to radiation effects that increased with modern architectures due to smaller CMOS structures. This work proposes fault tolerance methodologies, that are based on a fine grain view to modern reconfigurable architectures. The focus is on SEU mitigation challenges in SRAM based FPGAs which can result in crucial situations

    SINGLE EVENT UPSET DETECTION IN FIELD PROGRAMMABLE GATE ARRAYS

    Get PDF
    The high-radiation environment in space can lead to anomalies in normal satellite operation. A major cause of concern to spacecraft-designers is the single event upset (SEU). SEUs can result in deviations from expected component behavior and are capable of causing irreversible damage to hardware. In particular, Field Programmable Gate Arrays (FPGAs) are known to be highly susceptible to SEUs. Radiation-hardened versions of such devices are associated with an increase in power consumption and cost in addition to being technologically inferior when compared to contemporary commercial-off-the-shelf (COTS) parts. This thesis consequently aims at exploring the option of using COTS FPGAs in satellite payloads. A framework is developed, allowing the SEU susceptibility of such a device to be studied. SEU testing is carried out in a software-simulated fault environment using a set of Java classes called JBits. A radiation detector module, to measure the radiation backdrop of the device, is also envisioned as part of the final design implementation

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    Get PDF
    This research addresses design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 x 10(-6), three orders of magnitude better than non fault tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10(-6). The required hardware redundancy is approximately 15 times that of a non-fault tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND Multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results

    Characterization of Interconnection Delays in FPGAS Due to Single Event Upsets and Mitigation

    Get PDF
    RÉSUMÉ L’utilisation incessante de composants électroniques à géométrie toujours plus faible a engendré de nouveaux défis au fil des ans. Par exemple, des semi-conducteurs à mémoire et à microprocesseur plus avancés sont utilisés dans les systèmes avioniques qui présentent une susceptibilité importante aux phénomènes de rayonnement cosmique. L'une des principales implications des rayons cosmiques, observée principalement dans les satellites en orbite, est l'effet d'événements singuliers (SEE). Le rayonnement atmosphérique suscite plusieurs préoccupations concernant la sécurité et la fiabilité de l'équipement avionique, en particulier pour les systèmes qui impliquent des réseaux de portes programmables (FPGA). Les FPGA à base de cellules de mémoire statique (SRAM) présentent une solution attrayante pour mettre en oeuvre des systèmes complexes dans le domaine de l’avionique. Les expériences de rayonnement réalisées sur les FPGA ont dévoilé la vulnérabilité de ces dispositifs contre un type particulier de SEE, à savoir, les événements singuliers de changement d’état (SEU). Un SEU est considérée comme le changement de l'état d'un élément bistable (c'est-à-dire, un bit-flip) dû à l'effet d'un ion, d'un proton ou d’un neutron énergétique. Cet effet est non destructif et peut être corrigé en réécrivant la partie de la SRAM affectée. Les changements de délai (DC) potentiels dus aux SEU affectant la mémoire de configuration de routage ont été récemment confirmés. Un des objectifs de cette thèse consiste à caractériser plus précisément les DC dans les FPGA causés par les SEU. Les DC observés expérimentalement sont présentés et la modélisation au niveau circuit de ces DC est proposée. Les circuits impliqués dans la propagation du délai sont validés en effectuant une modélisation précise des blocs internes à l'intérieur du FPGA et en exécutant des simulations. Les résultats montrent l’origine des DC qui sont en accord avec les mesures expérimentales de délais. Les modèles proposés au niveau circuit sont, aux meilleures de notre connaissance, le premier travail qui confirme et explique les délais combinatoires dans les FPGA. La conception d'un circuit moniteur de délai pour la détection des DC a été faite dans la deuxième partie de cette thèse. Ce moniteur permet de détecter un changement de délai sur les sections critiques du circuit et de prévenir les pannes de synchronisation engendrées par les SEU sans utiliser la redondance modulaire triple (TMR).----------ABSTRACT The unrelenting demand for electronic components with ever diminishing feature size have emerged new challenges over the years. Among them, more advanced memory and microprocessor semiconductors are being used in avionic systems that exhibit a substantial susceptibility to cosmic radiation phenomena. One of the main implications of cosmic rays, which was primarily observed in orbiting satellites, is single-event effect (SEE). Atmospheric radiation causes several concerns regarding the safety and reliability of avionics equipment, particularly for systems that involve field programmable gate arrays (FPGA). SRAM-based FPGAs, as an attractive solution to implement systems in aeronautic sector, are very susceptible to SEEs in particular Single Event Upset (SEU). An SEU is considered as the change of the state of a bistable element (i.e., bit-flip) due to the effect of an energetic ion or proton. This effect is non-destructive and may be fixed by rewriting the affected part. Sensitivity evaluation of SRAM-based FPGAs to a physical impact such as potential delay changes (DC) has not been addressed thus far in the literature. DCs induced by SEU can affect the functionality of the logic circuits by disturbing the race condition on critical paths. The objective of this thesis is toward the characterization of DCs in SRAM-based FPGAs due to transient ionizing radiation. The DCs observed experimentally are presented and the circuit-level modeling of those DCs is proposed. Circuits involved in delay propagation are reverse-engineered by performing precise modeling of internal blocks inside the FPGA and executing simulations. The results show the root cause of DCs that are in good agreement with experimental delay measurements. The proposed circuit level models are, to the best of our knowledge, the first work on modeling of combinational delays in FPGAs.In addition, the design of a delay monitor circuit for DC detection is investigated in the second part of this thesis. This monitor allowed to show experimentally cumulative DCs on interconnects in FPGA. To this end, by avoiding the use of triple modular redundancy (TMR), a mitigation technique for DCs is proposed and the system downtime is minimized. A method is also proposed to decrease the clock frequency after DC detection without interrupting the process

    Reliability and Performance Analysis of a Fault Tolerant Data Handling Protocol for Aerospace Applications

    Get PDF
    Data communication inside the satellite is one of the most important factors in satellite design. For this purpose, a variety of protocols have been developed in recent years. Controller Area Network (CAN) is one of the well-developed protocols to be used in the On-Board Data Handling (OBDH) systems for communication and geosynchronous satellites. Nonetheless, for aerospace applications which demand radiation hardened integrated circuits, a full featured stand-alone Rad-Hard CAN controller is unavailable. HDL (Hardware Description Language) based IP(Intellectual Property) Cores which are widely developed to be implemented on Rad-Hard FPGAs are more attractive. This paper proposes a novel fault tolerant CAN controller based on FPGAs to provide on-board data handling requirements of the communication satellites. We outline some practical topologies and discuss their complexities and reliability. Despite the fact that the most famous methods like TMR (Triple Modular Redundancy), are very common among designers, the reliability analyses show that these methods are unable to tolerate single upsets in routing matrixes. This paper proposes a robust data bus controller based on dual duplex redundancy on FPGAs. The fault injection experiments reveal that the proposed approach represents better performance respective to the conventional hardware redundancy. Furthermore, the experiments show that the capability of tolerating SEU effects by the proposed method is increased up to 7.17 times with respect to a regular design. The proposed architecture imposes 16.26% and 5.2% overhead in the required resources and the operating frequency in comparison to the regular TMR method

    Développement des techniques de test et de diagnostic pour les FPGA hiérarchique de type mesh

    Get PDF
    The evolution trend of shrinking feature size and increasing complexity in modern electronics is being slowed down due to physical limits that generate numerous imperfections and defects during fabrication steps or projected life time of the chip. Field Programmable Gate Arrays (FPGAs) are used in complex digital systems mainly due to their reconfigurability and shorter time-to-market. To maintain a high reliability of such systems, FPGAs should be tested thoroughly for defects. FPGA architecture optimization for area saving and better signal routability is an ongoing process which directly impacts the overall FPGA testability, hence the reliability. This thesis presents a complete strategy for test and diagnosis of manufacturing defects in mesh-based FPGAs containing a novel multilevel interconnects topology which promises to provide better area and routability. Efficiency of the proposed test schemes is analyzed in terms of test cost, respective fault coverage and diagnostic resolution.L’évolution tendant à réduire la taille et augmenter la complexité des circuits électroniques modernes, est en train de ralentir du fait des limitations technologiques, qui génèrent beaucoup de d’imperfections et de defaults durant la fabrication ou la durée de vie de la puce. Les FPGAs sont utilisés dans les systèmes numériques complexes, essentiellement parce qu’ils sont reconfigurables et rapide à commercialiser. Pour garder une grande fiabilité de tels systèmes, les FPGAs doivent être testés minutieusement pour les defaults. L’optimisation de l’architecture des FPGAs pour l’économie de surface et une meilleure routabilité est un processus continue qui impacte directement la testabilité globale et de ce fait, la fiabilité. Cette thèse présente une stratégie complète pour le test et le diagnostique des defaults de fabrication des “mesh-based FPGA” contenant une nouvelle topologie d’interconnections à plusieurs niveaux, ce qui promet d’apporter une meilleure routabilité. Efficacité des schémas proposes est analysée en termes de temps de test, couverture de faute et résolution de diagnostique

    Methodology of highly reliable systems design

    Get PDF
    Práce představuje alternativní metodiku k již existujícím technikám pro návrh číslicových systémů se zvýšenou spolehlivostí implementovaných do obvodů FPGA a doplňuje některé nové vlastnosti při realizaci a testování těchto systémů. Práce se opírá o využití částečné dynamické rekonfigurace obvodu FPGA při návrhu systémů odolných proti poruchám, kde může být částečná rekonfigurace využita jako mechanizmus pro opravu a zotavení systému po výskytu poruchy. Práce nejprve představuje obecné principy diagnostiky, testování a spolehlivosti číslicových systémů včetně stručného popisu programovatelných obvodů FPGA a jejich architektury. Dále pokračuje přehledem současných metod a technik při návrhu a implementaci systémů odolných proti poruchám do obvodů FPGA, kde jsou popsány zejména techniky z oblasti detekce a lokalizace poruch, opravy a posuzování kvality návrhu. Nejdůležitější částí práce je popis metodiky pro návrh, implementaci a testování systémů odolných proti poruchám, která byla vytvořena pro obvody FPGA, jejichž konfigurační paměť je založena na pamětech typu SRAM. Nejprve je prezentována technika pro vytváření a automatizované generování hlídacích obvodů pro číslicové systémy a komunikační protokoly v FPGA, následně je prezentovaná referenční architektura spolehlivého systému implementovaného do FPGA včetně několika odolných architektur využívajících principu částečné dynamické rekonfigurace jako mechanizmu opravy a zotavení po výskytu poruchy. Dále je popsán způsob řízení rekonfiguračního procesu a testovací platforma pro snadné testovaní a ověření kvality systémů odolných proti poruchám implementovaných dle navržené metodiky. V závěru jsou diskutovány experimentální výsledky a přínos práce.In the thesis, a methodology alternative to existing methods of digital systems design with increased dependability implemented into FPGA is presented, new features which can be used in the implementation and testing of these systems are demonstrated. The research is based on the use of FPGA partial dynamic reconfiguration for the design of fault tolerant systems. In these applications, the partial dynamic reconfiguration can be used as a mechanism to correct the fault and recover the system after the fault occurrence. First, the general principles of diagnostics, testing and digital systems dependability are presented including a brief description of FPGA components and their architectures. Next, a survey of currently used methods and techniques used for the design and implementation of fault tolerant systems into FPGA is described, especially the methods used for fault detection and localization, their correction, together with the principles of evaluating fault tolerant systems design quality.  The most important part of the thesis is seen in the description of the design methodology, implementation and testing of fault tolerant systems implemented into FPGAs which uses SRAMs as the configuration memory. First, the methodology of developing and automated checker components design for digital systems and communication protocols is presented. Then, a reference architecture of a dependable system implemented into FPGA is demonstrated including several fault tolerant architectures based on the use of partial dynamic reconfiguration as the mechanism of fault correction and the recovery from it. The principles of controlling the reconfiguration process are described together with the description of the test platform which allows to test and verify the design of fault tolerant systems based on the methodology presented in the thesis. The experimental results and the contribution of the thesis are discussed in the conclusions.
    corecore