9 research outputs found

    Exploration and Analysis of Combinations of Hamming Codes in 32-bit Memories

    Reducing the threshold voltage of electronic devices dramatically increases their sensitivity to electromagnetic radiation, raising the probability that the content of memory cells changes. Designers mitigate failures using techniques such as Error Correction Codes (ECCs) to maintain information integrity. Although there are several studies of ECC usage in memories for space applications, there is still no consensus on which type of ECC to choose or how to organize it in memory. This work analyzes several configurations of Hamming codes applied to 32-bit memories intended for space applications. It proposes three types of Hamming codes, Ham(31,26), Ham(15,11), and Ham(7,4), as well as combinations of these codes. We employed 36 error patterns, ranging from one to four bit-flips, to analyze these codes. The experimental results show that the Ham(31,26) configuration, containing five bits of redundancy, obtained the highest rate of single error correction, almost 97%, with double, triple, and quadruple error correction rates of 78.7%, 63.4%, and 31.4%, respectively. In contrast, an ECC configuration comprising four Ham(7,4) codes, which uses twelve bits of redundancy, only fixes 87.5% of single errors
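As a rough illustration of the smallest code named in the abstract above, the following sketch encodes four data bits with Ham(7,4) and corrects a single bit-flip by syndrome decoding. The bit layout (parity bits at positions 1, 2, and 4) and the function names are assumptions made for this example; the abstract does not describe the authors' memory organization or error patterns, so the sketch does not reproduce their reported rates.

```python
def hamming74_encode(data_bits):
    """Encode [d1, d2, d3, d4] into a 7-bit codeword.

    Assumed layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome) after flipping the bit the syndrome names.

    A syndrome of 0 means no error was detected. A nonzero syndrome is the
    1-based position of the bit that the decoder assumes was flipped.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # single-error correction
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    hit = list(word)
    hit[4] ^= 1  # one bit-flip at position 5
    print(hamming74_decode(hit))
```

Because Ham(7,4) has minimum distance 3, a second flip in the same codeword produces a syndrome that points at a third, innocent bit, so the decoder returns wrong data. This is one reason multi-bit error patterns are harder to correct than single flips.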

    Dependability Analysis Methodology for FPGA-Based UAV Communication Protocols using UPPAAL-SMC

    UAVs are versatile and flexible devices used in a wide range of fields. Year after year, a large amount of research is invested in making them more efficient and autonomous when performing a task. This increase in autonomy requires the UAVs to have a dependable link between them to exchange crucial information such as current position and speed. These messages are transmitted to avoid collisions and perform missions efficiently. The communication between UAVs depends on several factors, such as the telemetry device used, the distance between the UAVs, their speed, and the application environment. Hence, a UAV designer must analyze the reliability of the communication based on the desired application environment and the necessary communication components in the UAV. Faults can also propagate in UAV components built using FPGA technology when they are placed in harsh radiation environments, as in radiation monitoring. These errors can lead to complications in the operation of a UAV communication component, and hence FPGAs require techniques like blind scrubbing to mitigate these faults. The availability of the communication component can be impacted when using this mitigation approach. Therefore, investigating the optimal configuration to maintain high and consistent availability is crucial. This thesis presents a methodology to perform high-level dependability analysis for UAV communication protocols using statistical model checking. First, we evaluate the reliability of point-to-point UAV communication using the UAV-UAV framework. The main objective of this framework is to investigate the link reliability between UAVs based on the specifications of the telemetry device and the availability of the communication components. To accomplish this, we propose models to emulate the behavior of two UAVs in the air, the condition of the transmitter and receiver, and the data exchange phase between two UAVs.
Then, we analyze the availability of a UAV communication module in a harsh radiation environment using blind scrubbing as a mitigation approach. The peak availability of UAV-UAV and UAV-GCS communication components is investigated through the UAV-UAV and UAV-GCS frameworks. The two frameworks utilize the SEU rate computed from the RTL code of the communication component design. We then implement crucial features such as scrubbing interval and scrub time in the transmitter and receiver modules to find the optimal scrubbing interval when the UAV communicates with other UAVs or the GCS. Finally, the effects of these faults and the limitations of blind scrubbing are also investigated in our work

    Dependability modeling and optimization of triple modular redundancy partitioning for SRAM-based FPGAs

    SRAM-based FPGAs are popular in the aerospace industry for their field programmability and low cost. However, they suffer from cosmic radiation-induced Single Event Upsets (SEUs). Triple Modular Redundancy (TMR) is a well-known technique to mitigate SEUs in FPGAs that is often used with another SEU mitigation technique known as configuration scrubbing. Traditional TMR provides protection against a single fault at a time, while partitioned TMR provides improved reliability and availability. In this paper, we present a methodology to analyze TMR partitioning at an early design stage using probabilistic model checking. The proposed formal model can capture both single and multiple-cell upset scenarios, without assuming equal partition sizes. Starting with a high-level description of a design, a Markov model is constructed from the Data Flow Graph (DFG) using a specified number of partitions, a component characterization library, and a user-defined scrub rate. Such a model and exhaustive analysis capture all the considered failures and repairs possible in the system within the radiation environment. Various reliability and availability properties are then verified automatically using the PRISM model checker, exploring the relationship between the scrub frequency and the number of TMR partitions required to meet the design requirements. The reported results also show that, based on a known voter failure rate, it is possible to find an optimal number of partitions at early design stages using the proposed method. Comment: Published in Reliability Engineering & System Safety, Volume 182, February 2019, Pages 107-11
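To make the interaction between scrubbing and TMR in the abstract above more concrete, here is a minimal sketch of a three-state continuous-time Markov chain for a single TMR group. It is not the paper's model: it assumes a perfect voter, one partition, identical modules with upset rate `lam`, and scrubbing approximated as an exponential repair rate `mu`. All names are hypothetical.

```python
def tmr_mttf(lam, mu):
    """Mean time to failure of one TMR group that starts fully healthy.

    Assumed states:
      0 = all three modules healthy
      1 = one module upset (voter still masks the fault)
      F = two or more modules upset (absorbing failure state)
    Assumed transitions: 0 -> 1 at rate 3*lam, 1 -> 0 at rate mu (scrub),
    1 -> F at rate 2*lam.

    Expected-time equations:
      T0 = 1/(3*lam) + T1
      T1 = 1/(2*lam + mu) + (mu / (2*lam + mu)) * T0
    """
    exit_rate_1 = 2 * lam + mu
    return (1 / (3 * lam) + 1 / exit_rate_1) / (1 - mu / exit_rate_1)


if __name__ == "__main__":
    lam = 1e-4  # assumed upset rate per module per hour
    for mu in (0.0, 0.01, 0.1, 1.0):  # assumed scrub rates per hour
        print(f"mu={mu:<5} MTTF={tmr_mttf(lam, mu):.3e} h")
```

Solving the two equations gives the closed form (5·lam + mu) / (6·lam²), which reduces to the familiar 5 / (6·lam) for TMR without repair. Under these assumptions the MTTF grows linearly with the scrub rate. A model with voter failures and several partitions, as described in the abstract, would add states and a trade-off that this sketch omits.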

    Maintenance of Smart Buildings using Fault Trees

    Timely maintenance is an important means of increasing system dependability and life span. Fault Maintenance trees (FMTs) are an innovative framework incorporating both maintenance strategies and degradation models and serve as a good planning platform for balancing total costs (operational and maintenance) with dependability of a system. In this work, we apply the FMT formalism to a Smart Building application and propose a framework that efficiently encodes the FMT into Continuous Time Markov Chains. This allows us to obtain system dependability metrics such as system reliability and mean time to failure, as well as costs of maintenance and failures over time, for different maintenance policies. We illustrate the pertinence of our approach by evaluating various dependability metrics and maintenance strategies of a Heating, Ventilation and Air-Conditioning system. Comment: arXiv admin note: substantial text overlap with arXiv:1801.0426

    Novel fault tolerant Multi-Bit Upset (MBU) Error-Detection and Correction (EDAC) architecture

    For the certification of airborne safety-critical applications, different techniques are implemented to prevent failures in electronic equipment. Hardware failures caused by solar radiation at aircraft flight altitudes, such as SEU (Single Event Upset) and MCU (Multiple Bit Upset), alter the bits that hold the information stored in memory. These failures cause errors, for example, in the configuration memory of an FPGA, which defines all of its functionality. Protection techniques typically require redundancy, which increases cost, component count, memory size, and weight. During the development of airborne safety-critical applications, design standards or recommendations such as ABD100, RTCA DO-160, and IEC62395 are generally followed, together with different protection techniques to avoid SEU or MCU failures. These techniques are based on specific technology processes, hardened memories, error detection and correction (EDAC) codes, software redundancy, Triple Modular Redundancy (TMR), or system-level solutions. This thesis focuses on minimizing, and even removing, the effects of SEUs and MCUs that occur in airborne electronics as a consequence of exposure to radiation of non-charged particles (for example, neutrons), which is intensified at typical flight altitudes. The safety categorization of certain equipment requires a fault-tolerant design, meaning that the system continues to operate correctly even when a failure occurs. For this reason, solutions such as the one presented in this thesis are of interest to the industrial sector.
The thesis includes an initial description of the physics of the radiation incident on aircraft and an analysis of its effects on semiconductor-based airborne electronic components, which lead to SEUs and MCUs. This analysis allows the correction procedures proposed later to be sized correctly and optimized. The thesis proposes a system that corrects SEUs and MCUs and fulfils the fault-tolerant requirement while minimizing both the level of redundancy and the complexity of the correction codes. Redundancy is minimized by introducing the concept of HSB (Hardwired Seed Bits), in which the essential information is reduced to a few seed bits that are neutral to radiation. The required correction codes are reduced to single-error correction thanks to the concept of Virtual Distance between Bits (an interleaving distance between adjacent bits), which makes it possible to correct multiple simultaneous errors (MCUs) with simple single-error-correcting codes.
An example application of the thesis is the implementation of fault-tolerant protection for the SRAM memory of an FPGA. Not only is the information stored in the memory protected, but the protection function itself, which is also stored in the SRAM, is self-protected. In this way, the system is able to regenerate itself after an SEU or even an MCU, regardless of the SRAM area hit by the radiation. Furthermore, this is achieved with simple codes such as parity bits and Hamming codes, minimizing the computational resources devoted to system supervision tasks.
Doctoral program: Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática. President: Luis Alfonso Entrena Arrontes. Secretary: Pedro Reviriego Vasallo. Member: Mª Luisa López Vallej
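The idea in the abstract above of using an interleaving distance between adjacent bits so that single-error-correcting codes can handle multi-bit upsets can be sketched as follows. This is a generic bit-interleaving illustration, not the thesis architecture: the HSB concept is not modeled, and the function names and the simple round-robin layout are assumptions.

```python
def interleave(words):
    """Lay out bit j of word i at physical cell index j * depth + i."""
    depth = len(words)
    width = len(words[0])
    return [words[i][j] for j in range(width) for i in range(depth)]


def deinterleave(cells, depth):
    """Recover the logical words from the physical cell array."""
    return [cells[i::depth] for i in range(depth)]


def flip_burst(cells, start, length):
    """Model a multi-bit upset that flips `length` physically adjacent cells."""
    hit = list(cells)
    for k in range(start, min(start + length, len(hit))):
        hit[k] ^= 1
    return hit


def errors_per_word(original, recovered):
    """Count flipped bits in each logical word."""
    return [sum(a != b for a, b in zip(w, r)) for w, r in zip(original, recovered)]


if __name__ == "__main__":
    words = [[0] * 7 for _ in range(4)]  # four 7-bit codewords, depth 4
    cells = interleave(words)
    damaged = deinterleave(flip_burst(cells, start=5, length=4), depth=4)
    print(errors_per_word(words, damaged))
```

With an interleaving depth of 4, any burst of up to 4 adjacent cells lands on 4 different logical words, so each word sees at most one flipped bit and a single-error-correcting code per word is enough. A burst longer than the depth wraps around and puts two errors into one word, which is why the depth has to be sized against the expected upset width.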

    Early Dependability Analysis of FPGA-Based Space Applications Using Formal Verification

    SRAM-based FPGAs are increasingly attractive in the aerospace industry for their field programmability and low cost. Unfortunately, they suffer from cosmic radiation-induced Single Event Effects (SEEs). In safety-critical applications, the dependability of the design is a prime concern since failures may have catastrophic consequences. Hence, an early analysis of dependability of such safety-critical applications will enable designers to develop systems that meet high dependability requirements, such as those of the DO-254 standard. In this thesis, we propose a high-level dependability and performability analysis methodology based on probabilistic model checking. Compared to paper-and-pencil and discrete-event simulation approaches, our methodology is more accurate due to the use of an automated formal verification technique. Moreover, compared to fault injection or beam testing, analysis at early design stages can guide designers to build more reliable designs, reducing the overall cost and effort. The proposed methodology can perform three different types of analysis: evaluation of available design options, optimization of scrub intervals while satisfying design assurance level requirements, and optimal partitioning of Triple-Modular Redundant (TMR) systems. Such analysis can also guide designers to adopt proper mitigation techniques, such as rescheduling, TMR, or TMR with less frequent scrubs, and can even help to decide the number of TMR partitions for a given scrub interval. Starting from a high-level description of a system, and depending on the preferred analysis, a Markov model or Markov reward model is constructed from the extracted Control Data Flow Graph (CDFG) and the failure/mitigation parameters for the targeted FPGA. Such modeling and exhaustive analysis, carried out with a probabilistic model checking technique, can capture all the failures and repairs possible (according to some general model) in the system within the radiation environment.
To illustrate the applicability of the proposed approach, we present our quantitative analysis obtained from DSP benchmark circuits

    Rapport annuel 2014
