571 research outputs found

    Study of Single Event Transient Error Mitigation

    Get PDF
    Single Event Transient (SET) errors in ground-level electronic devices are a growing concern in the radiation hardening field. However, effective SET mitigation technologies which satisfy ground-level demands such as generic, flexible, efficient, and fast, are limited. The classic Triple Modular Redundancy (TMR) method is the most well-known and popular technique in space and nuclear environment. But it leads to more than 200% area and power overheads, which is too costly to implement in ground-level applications. Meanwhile, the coding technique is extensively utilized to inhibit upset errors in storage cells, but the irregularity of combinatorial logics limits its use in SET mitigation. Therefore, SET mitigation techniques suitable for ground-level applications need to be addressed. Aware of the demands for SET mitigation techniques in ground-level applications, this thesis proposes two novel approaches based on the redundant wire and approximate logic techniques. The Redundant Wire is a SET mitigation technique. By selectively adding redundant wire connections, the technique can prohibit targeted transient faults from propagating on the fly. This thesis proposes a set of signature-based evaluation equations to efficiently estimate the protecting effect provided by each redundant wire candidates. Based on the estimated results, a greedy algorithm is used to insert the best candidate repeatedly. Simulation results substantiate that the evaluation equations can achieve up to 98% accuracy on average. Regarding protecting effects, the technique can mask 18.4% of the faults with a 4.3% area, 4.4% power, and 5.4% delay overhead on average. Overall, the quality of protecting results obtained are 2.8 times better than the previous work. Additionally, the impact of synthesis constraints and signature length are discussed. Approximate Logic is a partial TMR technique offering a trade-off between fault coverage and area overheads. The approximate logic consists of an under-approximate logic and an over-approximate logic. The under-approximate logic is a subset of the original min-terms and the over-approximate logic is a subset of the original max-terms. This thesis proposes a new algorithm for generating the two approximate logics. Through the generating process, the algorithm considers the intrinsic failure probabilities of each gate and utilizes a confidence interval estimate equation to minimize required computations. The technique is applied to two fault models, Stuck-at and SET, and the separate results are compared and discussed. The results show that the technique can reduce the error 75% with an area penalty of 46% on some circuits. The delay overheads of this technique are always two additional layers of logic. The two proposed SET mitigation techniques are both applicable to generic combinatorial logics and with high flexibility. The simulation shows promising SET mitigation ability. The proposed mitigation techniques provide designers more choices in developing reliable combinatorial logic in ground-level applications

    Error Mitigation Using Approximate Logic Circuits: A Comparison of Probabilistic and Evolutionary Approaches

    Get PDF
    Technology scaling poses an increasing challenge to the reliability of digital circuits. Hardware redundancy solutions, such as triple modular redundancy (TMR), produce very high area overhead, so partial redundancy is often used to reduce the overheads. Approximate logic circuits provide a general framework for optimized mitigation of errors arising from a broad class of failure mechanisms, including transient, intermittent, and permanent failures. However, generating an optimal redundant logic circuit that is able to mask the faults with the highest probability while minimizing the area overheads is a challenging problem. In this study, we propose and compare two new approaches to generate approximate logic circuits to be used in a TMR schema. The probabilistic approach approximates a circuit in a greedy manner based on a probabilistic estimation of the error. The evolutionary approach can provide radically different solutions that are hard to reach by other methods. By combining these two approaches, the solution space can be explored in depth. Experimental results demonstrate that the evolutionary approach can produce better solutions, but the probabilistic approach is close. On the other hand, these approaches provide much better scalability than other existing partial redundancy techniques.This work was supported by the Ministry of Economy and Competitiveness of Spain under project ESP2015-68245-C4-1-P, and by the Czech science foundation project GA16-17538S and the Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science - LQ1602

    Implementation and Characterization of AHR on a Xilinx FPGA

    Get PDF
    A new version of the Adaptive-Hybrid Redundancy (AHR) architecture was developed to be implemented and tested in hardware using Commercial-Off-The-Shelf (COTS) Field-Programmable Gate Arrays (FPGAs). The AHR architecture was developed to mitigate the effects that the Single Event Upset (SEU) and Single Event Transient (SET) radiation effects have on processors and was tested on a Microprocessor without Interlocked Pipeline Stages (MIPS) architecture. The AHR MIPS architecture was implemented in hardware using two Xilinx FPGAs. A Universal Asynchronous Receiver Transmitter (UART) based serial communication network was added to the AHR MIPS design to enable inter-board communication between the two FPGAs. The runtime performance of AHR MIPS was measured in hardware and compared against the runtime performance of standalone TMR and TSR MIPS architectures. The hardware implementation of AHR MIPS demonstrated flexible runtime performance that was nearly as fast as TMR MIPS, never as slow as TSR MIPS, and demonstrated performance in between those extremes. Hardware testing and verification of AHR MIPS showed that the AHR mitigation strategy presents a large performance tradespace, where a user can adjust both the runtime processor performance and radiation tolerance to fit the constraints of a space mission, while also continuing to provide adaptive performance based upon the current radiation environment

    Cross-layer Soft Error Analysis and Mitigation at Nanoscale Technologies

    Get PDF
    This thesis addresses the challenge of soft error modeling and mitigation in nansoscale technology nodes and pushes the state-of-the-art forward by proposing novel modeling, analyze and mitigation techniques. The proposed soft error sensitivity analysis platform accurately models both error generation and propagation starting from a technology dependent device level simulations all the way to workload dependent application level analysis

    Methods and architectures based on modular redundancy for fault-tolerant combinational circuits

    Get PDF
    Dans cette thèse, nous nous intéressons à la recherche d architectures fiables pour les circuits logiques. Par fiable , nous entendons des architectures permettant le masquage des fautes et les rendant de ce fait tolérantes" à ces fautes. Les solutions pour la tolérance aux fautes sont basées sur la redondance, d où le surcoût qui y est associé. La redondance peut être mise en oeuvre de différentes manières : statique ou dynamique, spatiale ou temporelle. Nous menons cette recherche en essayant de minimiser tant que possible le surcoût matériel engendré par le mécanisme de tolérance aux fautes. Le travail porte principalement sur les solutions de redondance modulaire, mais certaines études développées sont beaucoup plus générales.In this thesis, we mainly take into account the representative technique Triple Module Redundancy (TMR) as the reliability improvement technique. A voter is an necessary element in this kind of fault-tolerant architectures. The importance of reliability in majority voter is due to its application in both conventional fault-tolerant design and novel nanoelectronic systems. The property of a voter is therefore a bottleneck since it directly determines the whole performance of a redundant fault-tolerant digital IP (such as a TMR configuration). Obviously, the efficacy of TMR is to increase the reliability of digital IP. However, TMR sometimes could result in worse reliability than a simplex function module could. A better understanding of functional and signal reliability characteristics of a 3-input majority voter (majority voting in TMR) is studied. We analyze them by utilizing signal probability and boolean difference. It is well known that the acquisition of output signal probabilities is much easier compared with the obtention of output reliability. The results derived in this thesis proclaim the signal probability requirements for inputs of majority voter, and thereby reveal the conditions that TMR technique requires. This study shows the critical importance of error characteristics of majority voter, as used in fault-tolerant designs. As the flawlessness of majority voter in TMR is not true, we also proposed a fault-tolerant and simple 2-level majority voter structure for TMR. This alternative architecture for majority voter is useful in TMR schemes. The proposed solution is robust to single fault and exceeds those previous ones in terms of reliability.PARIS-Télécom ParisTech (751132302) / SudocSudocFranceF

    Designing reliable cyber-physical systems overview associated to the special session at FDL’16

    Get PDF
    CPS, that consist of a cyber part – a computing system – and a physical part – the system in the physical environment – as well as the respective interfaces between those parts, are omnipresent in our daily lives. The application in the physical environment drives the overall requirements that must be respected when designing the computing system. Here, reliability is a core aspect where some of the most pressing design challenges are: • monitoring failures throughout the computing system, • determining the impact of failures on the application constraints, and • ensuring correctness of the computing system with respect to application-driven requirements rooted in the physical environment. This paper provides an overview of techniques discussed in the special session to tackle these challenges throughout the stack of layers of the computing system while tightly coupling the design methodology to the physical requirements.</p
    corecore