178 research outputs found

    Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy

    Get PDF
    The operating clock frequency is determined by the longest signal propagation delay, setup/hold time, and timing margin. These are becoming less predictable with the increasing design complexity and process miniaturization. The difficult challenge is then to ensure that a device operating at its clock frequency is error-free with quantifiable assurance. Effort at device-level engineering will not suffice for these circuits exhibiting wide process variation and heightened sensitivities to operating condition stress. Logic-level redress of this issue is a necessity and we propose a design-level remedy for this timing-uncertainty problem. The aim of the design and analysis approaches presented in this dissertation is to provide framework, SABRE, wherein an increased operating clock frequency can be achieved. The approach is a combination of analytical modeling, experimental analy- sis, hardware /time-redundancy design, exception handling and recovery techniques. Our proposed design replicates only a necessary part of the original circuit to avoid high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical combinational circuit is path-wise partitioned into two sections. The combinational circuits associated with long paths are laid out without any intrusion except for the fan-out connections from the first section of the circuit to a replicated second section of the combinational circuit. Thus only the second section of the circuit is replicated. The signals fanning out from the first section are latches, and thus are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed at a subsequent clock cycle to ascertain relaxed timing paths. This insures that the likelihood of mistiming due to stress or process variation is eliminated. During the subsequent clock cycle, the outcome of the two logically identical, yet time-interleaved, circuit outputs are compared to detect faults. When a fault is detected, the retry sig- nal is triggered and the dynamic frequency-step-down takes place before a pipe flush, and retry is issued. The significant timing overhead associated with the retry is offset by the rarity of the timing violation events. Simulation results on ISCAS Benchmark circuits show that 10% of clock frequency gain is possible with 10 to 20 % of hardware overhead of replicated timing-critical circuit

    STAHL: A Novel Scan-Test-Aware Hardened Latch Design

    Get PDF
    As modern technology nodes become more susceptible to soft errors, many radiation hardened latch designs have been proposed. However, redundant circuitry used to tolerate soft errors in such hardened latches also reduces the test coverage of cell-internal manufacturing defects. To avoid potential test escapes that lead to soft error vulnerability and reliability issues, this paper proposes a novel Scan-Test-Aware Hardened Latch (STAHL). Simulation results show that STAHL has superior defect coverage compared to previous hardened latches while maintaining full radiation hardening in function mode.24th IEEE European Test Symposium (ETS\u2719), May 27-31, 2019, Baden-Baden, German

    Radiation Tolerant Electronics, Volume II

    Get PDF
    Research on radiation tolerant electronics has increased rapidly over the last few years, resulting in many interesting approaches to model radiation effects and design radiation hardened integrated circuits and embedded systems. This research is strongly driven by the growing need for radiation hardened electronics for space applications, high-energy physics experiments such as those on the large hadron collider at CERN, and many terrestrial nuclear applications, including nuclear energy and safety management. With the progressive scaling of integrated circuit technologies and the growing complexity of electronic systems, their ionizing radiation susceptibility has raised many exciting challenges, which are expected to drive research in the coming decade.After the success of the first Special Issue on Radiation Tolerant Electronics, the current Special Issue features thirteen articles highlighting recent breakthroughs in radiation tolerant integrated circuit design, fault tolerance in FPGAs, radiation effects in semiconductor materials and advanced IC technologies and modelling of radiation effects

    Développement de circuits logiques programmables résistants aux alas logiques en technologie CMOS submicrométrique

    Get PDF
    The electronics associated to the particle detectors of the Large Hadron Collider (LHC), under construction at CERN, will operate in a very harsh radiation environment. Most of the microelectronics components developed for the first generation of LHC experiments have been designed with very precise experiment-specific goals and are hardly adaptable to other applications. Commercial Off-The-Shelf (COTS) components cannot be used in the vicinity of particle collision due to their poor radiation tolerance. This thesis is a contribution to the effort to cover the need for radiation-tolerant SEU-robust programmable components for application in High Energy Physics (HEP) experiments. Two components are under development: a Programmable Logic Device (PLD) and a Field-Programmable Gate Array (FPGA). The PLD is a fuse-based, 10-input, 8-I/O general architecture device in 0.25 micron CMOS technology. The FPGA under development is instead a 32x32 logic block array, equivalent to ~25k gates, in 0.13 micron CMOS. This work focussed also on the research for an SEU-robust register in both the mentioned technologies. The SEU-robust register is employed as a user data flip-flop in the FPGA and PLD designs and as a configuration cell as well in the FPGA design

    Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy

    Get PDF
    The operating clock frequency is determined by the longest signal propagation delay, setup/hold time, and timing margin. These are becoming less predictable with the increasing design complexity and process miniaturization. The difficult challenge is then to ensure that a device operating at its clock frequency is error-free with quantifiable assurance. Effort at device-level engineering will not suffice for these circuits exhibiting wide process variation and heightened sensitivities to operating condition stress. Logic-level redress of this issue is a necessity and we propose a design-level remedy for this timing-uncertainty problem. The aim of the design and analysis approaches presented in this dissertation is to provide framework, SABRE, wherein an increased operating clock frequency can be achieved. The approach is a combination of analytical modeling, experimental analy- sis, hardware /time-redundancy design, exception handling and recovery techniques. Our proposed design replicates only a necessary part of the original circuit to avoid high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical combinational circuit is path-wise partitioned into two sections. The combinational circuits associated with long paths are laid out without any intrusion except for the fan-out connections from the first section of the circuit to a replicated second section of the combinational circuit. Thus only the second section of the circuit is replicated. The signals fanning out from the first section are latches, and thus are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed at a subsequent clock cycle to ascertain relaxed timing paths. This insures that the likelihood of mistiming due to stress or process variation is eliminated. During the subsequent clock cycle, the outcome of the two logically identical, yet time-interleaved, circuit outputs are compared to detect faults. When a fault is detected, the retry sig- nal is triggered and the dynamic frequency-step-down takes place before a pipe flush, and retry is issued. The significant timing overhead associated with the retry is offset by the rarity of the timing violation events. Simulation results on ISCAS Benchmark circuits show that 10% of clock frequency gain is possible with 10 to 20 % of hardware overhead of replicated timing-critical circuit

    Exploration and Analysis of Combinations of Hamming Codes in 32-bit Memories

    Full text link
    Reducing the threshold voltage of electronic devices increases their sensitivity to electromagnetic radiation dramatically, increasing the probability of changing the memory cells' content. Designers mitigate failures using techniques such as Error Correction Codes (ECCs) to maintain information integrity. Although there are several studies of ECC usage in spatial application memories, there is still no consensus in choosing the type of ECC as well as its organization in memory. This work analyzes some configurations of the Hamming codes applied to 32-bit memories in order to use these memories in spatial applications. This work proposes the use of three types of Hamming codes: Ham(31,26), Ham(15,11), and Ham(7,4), as well as combinations of these codes. We employed 36 error patterns, ranging from one to four bit-flips, to analyze these codes. The experimental results show that the Ham(31,26) configuration, containing five bits of redundancy, obtained the highest rate of simple error correction, almost 97\%, with double, triple, and quadruple error correction rates being 78.7\%, 63.4\%, and 31.4\%, respectively. While an ECC configuration encompassed four Ham(7.4), which uses twelve bits of redundancy, only fixes 87.5\% of simple errors

    Single event upset hardened embedded domain specific reconfigurable architecture

    Get PDF

    耐ソフトエラーラッチにおける欠陥の分析、検出及び評価に関する研究

    Get PDF
    The development of modern integrated circuits (ICs) has greatly changed the life of humankind. Nowadays, IC s are also indispensable to mission-critical applications, such as medical devices, autonomous cars, aircraft navigating systems, and satellites. The reliability of these mission-critical applications is a major concern. A soft-error occurring in an IC is a severe threat to its reliability, especially for mission-critical applications. The continuous trend of shrinking technology feature sizes makes modern ICs more and more vulnerable to soft errors. Soft-errors are caused by radiation particles striking an IC and generating current pulses to disturb its functionality. A soft-error can cause data corruption and may eventually lead to system failure s If a soft-error occurs in an operational medical device during surgery, it may cause a malfunction of this device and interrupt the surgery process. A soft-error may change the control data of an autonomous car which may lead to an accident. A soft-error may corrupt the aircraft navigating systems. No one would take the chance to let it happen even though malfunction s caused by soft errors can be solved by resetting these devices. Because reset takes time and severe results may happen during the resetting. If a soft-error causes a malfunction in the control system of a satellite, it may not be able to maintain its height and eventually burn up as it falls into the Earth’s atmosphere. Hence, it is important to protect ICs from soft errors. Many soft-error tolerance methods have been proposed to protect ICs against soft-errors. In an IC, memory elements and storage elements (e.g., latches and flip flops) are the most vulnerable to soft-errors, and data stored in them are crucial to the operation of a circuit. Error correction codes (ECCs) can be u sed to protect memories. Register-level soft-error tolerance methods can be used to detect soft-errors in latches by using parity checking and correct them by resetting. Hardened designs protect latches against soft-errors by using redundant feedback loops to store the same input data and using a voter to select the correct output. The advantage of using hardened designs is that they can prevent soft-errors from reaching outputs while ECCs and register-level soft-error tolerance methods must detect soft-errors and then correct them by restoring the data. For protecting storage elements in mission-critical applications, hardened latch design is the best option because it has high reliability and can save the resetting time. Many state-of-the-art hardened latch designs have been proposed to tolerate soft errors and they are believed to have good soft-error tolerability. Defects (physical flaws due to imperfect production (production defects) and physical changes caused by aging effects after a long operation time (aging-related defects) can also cause a malfunction of a circuit and cause a system failure eventually. Different from the temporal state change of a circuit caused by soft errors, defects are permanent damages to a circuit and can disturb the behavior of a circuit from its desired manner. Defects in storage elements should be detected to make sure a system/device operating correctly and stably. Scan test is a commonly used defect detection method, which connects reconfigured storage elements to form a shift register with external access and the internal states of these storage elements can be easily controlled and checked. However, the impact of defects on existing state of the art hardened latch design has not been considered. This impact requires consideration because added redundancy in hardened latch designs can not only mask soft-errors but also mask the effects of defects and it can lead to two serious problems: Problem-1 (Low Testability): Production defects in hardened latch designs are difficult to detect with conventional scan tests, in which the observability (an important metric to evaluate a circuit’s testability) of defects in hardened latch designs can be greatly reduced. Therefore, existing state-of-the-art hardened latches have low observability and thus low testability. Furthermore, defects that escaped the production test (undetected defects) may become more and more serious and cause a system failure eventually. Problem-2 (Low Soft-Error Tolerability): Undetected defects and aging-related defects can make hardened latch designs vulnerable to soft-errors while defect-free ones do not. The soft-error tolerability of hardened latch designs may be compromise d by undetected defects or aging related defects. This research is the first to consider Problem-1 of low testability of hardened latches and Problem-2 of defects reducing the reliability of hardened latches. Furthermore, this research is the first to pro pose a comprehensive solution to solve these two problems with the following five major contributions: Contribution-1: A first of its kind metric for quantifying the impact of defects on hardened latches, called Post-Test Vulnerability Factor (PTVF). It is used to analyze the residual soft-error tolerability of hardened latches after testing. Problem-2 is solved by this first major contribution. Contribution-2: A novel design called Scan-Test-Aware Hardened Latch (STAHL) that provides the highest defect coverage in comparison with all existing hardened latches. Problem-1 is solved by using STAHL to build a scan c ell to perform a scan test. Contribution-3: A novel scan test procedure is proposed to solve Problem-1 by fully testing the STAHL based scan cell. Contribution-4: A novel High-Performance Scan-Test-Aware Hardened Latch (HP-STAHL) design can also solve Problem-1 and has similar defect coverage as STAHL but has lower power consumption and higher propagation speed. Contribution-5: A novel scan test procedure is proposed to fully test the HP STAHL-based scan cell to solve Problem-1. Comprehensive simulation results demonstrate the accuracy of the PTVF metric and the effectiveness of the STAHL-based scan test and HP-STAHL-based scan test. As the first comprehensive study bridging the gap between hardened latch design s and IC testing, the findings of this research are expected to significantly improve the soft-error-related reliability of IC designs for mission-critical applications. Furthermore, the two proposed hardened latches and the scan test procedures can not only be use d to detect defects after production but also can be applied to detect aging related defects in the field through performing built-in self-test (BIST). In Chapter 1, an example is introduced to indicate Problem-1 and Problem-2. Chapter 2 shows the background information of soft-errors and defects. Chapter 3 shows some typical soft-error mitigation methods and details of a scan test. Chapter 4 describes the detailed information of PTVF Contribution-1). Chapter 5 shows the structure of STAHL (Contribution-2) and Chapter 6 shows the scan test procedure of testing the STAHL-based scan cell (Contribution-3). Chapter 7 shows the structure of HP-STAHL (Contribution-4) and Chapter 8 shows the scan test procedure of testing the HP-STAHL based scan cell (Contribution-5). Chapter 9 shows the experimental results of comparing STAHL and HP-STAHL with state-of-the-art hardened latch designs. Chapter 10 concludes this thesis.九州工業大学博士学位論文 学位記番号:情工博甲第371号 学位授与年月日:令和4年9月26日1. Introduction|2. Background|3. Related Works|4. Post-Test Vulnerability Factor (PTVF)|5. Scan-Test Aware Hardened Latch (STAHL)|6. Scan Test Based on STAHL|7. High Performance Scan-Test-Aware Hardened Latch (HP STAHL)|8. Scan Test Based on HP STAHL|9. Experimental Evaluation|10. Conclusions and Future Works九州工業大学令和4年
    corecore