
    A combinatorial method for the evaluation of yield of fault-tolerant systems-on-chip

    In this paper we develop a combinatorial method for the evaluation of the yield of fault-tolerant systems-on-chip. The method assumes that defects are produced according to a model in which defects are lethal and affect given components of the system following a distribution common to all defects. The distribution of the number of defects is arbitrary. The method is based on formulating the yield as 1 minus the probability that a given boolean function with multiple-valued variables has value 1. That probability is computed by analyzing a ROMDD (reduced ordered multiple-valued decision diagram) representation of the function. For efficiency reasons, we first build a coded ROBDD (reduced ordered binary decision diagram) representation of the function and then transform that coded ROBDD into the ROMDD required by the method. We present numerical experiments showing that the method can cope with quite large systems in moderate CPU times.
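
    A minimal sketch of the core computation, assuming a toy binary (rather than multiple-valued) decision diagram; the node layout, defect probabilities, and example kill-function below are hypothetical, not the paper's data structures. The yield is 1 minus P(f = 1), obtained in one bottom-up pass over the diagram.

# Toy BDD node; a real ROBDD/ROMDD evaluator would memoize on node
# identity so shared subgraphs are visited once.
class Node:
    def __init__(self, var, lo, hi):
        self.var, self.lo, self.hi = var, lo, hi  # tested variable; 0/1 branches

ZERO, ONE = "0", "1"  # terminal nodes

def prob_one(node, p):
    """P(f = 1) under independent variables, with p[var] = P(var = 1)."""
    if node is ONE:
        return 1.0
    if node is ZERO:
        return 0.0
    return ((1.0 - p[node.var]) * prob_one(node.lo, p)
            + p[node.var] * prob_one(node.hi, p))

# Example kill-function: the chip is lost only if defects hit both x and y.
kill = Node("x", ZERO, Node("y", ZERO, ONE))
print(1.0 - prob_one(kill, {"x": 0.1, "y": 0.2}))  # yield = 0.98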

    Combinatorial methods for the evaluation of yield and operational reliability of fault-tolerant systems-on-chip

    In this paper we develop combinatorial methods for the evaluation of yield and operational reliability of fault-tolerant systems-on-chip. The method for yield computation assumes that defects are produced according to a model in which defects are lethal and affect given components of the system following a distribution common to all defects; the method for the computation of operational reliability also assumes that the fault-tree function of the system is increasing. The distribution of the number of defects is arbitrary. The methods are based on formulating, respectively, the yield and the operational reliability as the probability that a given boolean function with multiple-valued variables has value 1. That probability is computed by analyzing a ROMDD (reduced ordered multiple-valued decision diagram) representation of the function. For efficiency reasons, a coded ROBDD (reduced ordered binary decision diagram) representation of the function is built first and then transformed into the ROMDD required by the methods. We present numerical experiments showing that the methods can cope with quite large systems in moderate CPU times.
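
    The "increasing fault-tree" assumption means the structure function is monotone: repairing a component can never break the system. Below is a brute-force sketch for intuition only (the ROMDD pass exists precisely to avoid this exponential enumeration); the 2-of-3 example and probabilities are illustrative.

from itertools import product

def reliability(phi, p):
    """P(phi = 1) with p[i] = P(component i works), components independent."""
    total = 0.0
    for state in product((0, 1), repeat=len(p)):
        if phi(state):
            w = 1.0
            for works, pi in zip(state, p):
                w *= pi if works else (1.0 - pi)
            total += w
    return total

# 2-of-3 majority system: monotone, since one more working part never hurts.
phi = lambda s: sum(s) >= 2
print(reliability(phi, [0.9, 0.9, 0.9]))  # ~0.972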

    A Simplified Method to Calculate Failure Times in Fault-Tolerant Systems

    A simplified method is presented to calculate moments of the failure time and residual lifetime of a fault-tolerant system. The method is based on recent results in queueing theory. Its effectiveness is illustrated on a dual repairable system from the literature.
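
    For intuition, the failure-time moments of a dual repairable system can be obtained from its Markov generator (a standard phase-type computation, not necessarily the paper's simplified method; the rates below are invented): with transient states "2 up" and "1 up", the k-th moment is k! * (-Q)^-k applied to the all-ones vector.

import numpy as np

lam, mu = 1e-3, 1e-1                  # per-unit failure / repair rates (invented)
Q = np.array([[-2*lam,        2*lam],   # "2 up": either unit fails
              [    mu, -(lam + mu)]])   # "1 up": repair or second failure
ones = np.ones(2)
Ninv = np.linalg.inv(-Q)              # fundamental matrix of the transient part
m1 = Ninv @ ones                      # E[T] from each starting state
m2 = 2 * (Ninv @ Ninv) @ ones         # E[T^2]
print("MTTF:", m1[0], " variance:", m2[0] - m1[0] ** 2)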

    Space Station Freedom data management system growth and evolution report

    The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). This study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues have been raised and discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified and suggestions for long-term upgrades have been made. This activity has allowed Ames personnel to develop a rapport with the JSC civil service and contractor teams that permits an independent check-and-balance role for the DMS.

    Delay Measurements and Self Characterisation on FPGAs

    This thesis examines new timing measurement methods for self delay characterisation of Field-Programmable Gate Array (FPGA) components and delay measurement of complex circuits on FPGAs. Two novel measurement techniques, based on analysis of a circuit's output failure rate and of its transition probability, are proposed for accurate, precise and efficient measurement of propagation delays. The transition probability based method is especially attractive, since it requires no modifications to the circuit under test and few hardware resources, making it an ideal method for physical delay analysis of FPGA circuits. Relentless advancements in process technology have led to smaller and denser transistors in integrated circuits. While FPGA users benefit from this in terms of increased hardware resources for more complex designs, actual FPGA productivity in terms of timing performance (operating frequency, latency and throughput) has lagged behind the potential improvements of the improved technology, due to delay variability in FPGA components and the inaccuracy of the timing models used in FPGA timing analysis. The ability to measure the delay of any arbitrary circuit on an FPGA offers many opportunities for on-chip characterisation and physical timing analysis, allowing delay variability to be accurately tracked and variation-aware optimisations to be developed, reducing the productivity gap observed in today's FPGA designs. The measurement techniques are developed into complete self-measurement and characterisation platforms in this thesis, demonstrating their practical use in actual FPGA hardware for cross-chip delay characterisation and accurate delay measurement of both complex combinational and sequential circuits, further reinforcing their value in addressing the delay variability problem in FPGAs.
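
    A toy model of the transition-probability idea (the Gaussian jitter model and every number below are invented; this is not the thesis's measurement circuit): sweep the sampling period T across the true path delay D and watch the measured transition probability collapse once T < D.

import random

def measured_transition_prob(T, D, sigma=0.05e-9, trials=10000):
    """Fraction of cycles whose fresh value arrives before the capture edge
    (the ideal circuit under test toggles every cycle)."""
    ok = sum(1 for _ in range(trials) if D + random.gauss(0.0, sigma) <= T)
    return ok / trials

D_true = 2.0e-9                       # 2 ns path, "unknown" to the measurer
for T in (2.2e-9, 2.1e-9, 2.0e-9, 1.9e-9, 1.8e-9):
    p = measured_transition_prob(T, D_true)
    print(f"T = {T * 1e9:.1f} ns  transition prob = {p:.3f}")
# estimate the delay as the period where p falls through 0.5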

    inSense: A Variation and Fault Tolerant Architecture for Nanoscale Devices

    Transistor technology scaling has been the driving force in improving the size, speed, and power consumption of digital systems. As devices approach atomic size, however, their reliability and performance are increasingly compromised due to reduced noise margins, difficulties in fabrication, and emergent nano-scale phenomena. Scaled CMOS devices, in particular, suffer from process variations such as random dopant fluctuation (RDF) and line edge roughness (LER), transistor degradation mechanisms such as negative-bias temperature instability (NBTI) and hot-carrier injection (HCI), and increased sensitivity to single event upsets (SEUs). Consequently, future devices may exhibit reduced performance, diminished lifetimes, and poor reliability. This research proposes a variation and fault tolerant architecture, the inSense architecture, as a circuit-level solution to the problems induced by the aforementioned phenomena. The inSense architecture augments circuits with introspective and sensory capabilities that dynamically detect and compensate for process variations, transistor degradation, and soft errors. This approach creates "smart" circuits able to function despite the use of unreliable devices and is applicable to current CMOS technology as well as next-generation devices using new materials and structures. Furthermore, this work presents an automated prototype implementation of the inSense architecture targeted to CMOS devices, evaluated via implementation in the ISCAS '85 benchmark circuits. The automated prototype implementation is functionally verified and characterized: error detection capability (with error windows from approximately 30-400 ps) can be added for less than 2% area overhead for circuits of non-trivial complexity. Single event transient (SET) detection capability (configurable with target set-points) is found to be functional, although its overheads generally track those of a standard dual modular redundancy (DMR) implementation.
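
    The error-window detection can be pictured with a Razor-style shadow sample; this is a sketch of the general technique only, not the actual inSense circuitry, and the timing values are hypothetical.

def detect_late_transition(arrival_ns, clock_edge_ns, window_ns):
    """Flag a data edge landing inside the detection window: the main flop
    and a delayed shadow sample then disagree."""
    main = arrival_ns <= clock_edge_ns                # main sample
    shadow = arrival_ns <= clock_edge_ns + window_ns  # delayed sample
    return main != shadow

print(detect_late_transition(10.2, 10.0, 0.4))  # 0.2 ns late -> True
print(detect_late_transition(9.8, 10.0, 0.4))   # on time     -> False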

    Sustainable Fault-handling Of Reconfigurable Logic Using Throughput-driven Assessment

    A sustainable Evolvable Hardware (EH) system is developed for SRAM-based reconfigurable Field Programmable Gate Arrays (FPGAs) using outlier detection and group testing-based assessment principles. The fault diagnosis methods presented herein leverage throughput-driven, relative fitness assessment to maintain resource viability autonomously. Group testing-based techniques are developed for adaptive, input-driven fault isolation in FPGAs, without the need for exhaustive testing or coding-based evaluation, as sketched below. The techniques keep the device operational and, when possible, generate validated outputs throughout the repair process. Adaptive fault isolation methods based on discrepancy-enabled pairwise comparisons are developed: by observing the discrepancy characteristics of multiple Concurrent Error Detection (CED) configurations, a method for robust detection of faults is developed based on pairwise parallel evaluation using Discrepancy Mirror logic. The results from the analytical FPGA model are demonstrated via a self-healing, self-organizing evolvable hardware system. Reconfigurability of the SRAM-based FPGA is leveraged to identify logic resource faults, which are successively excluded by group testing using alternate device configurations. This simplifies the system architect's role to the definition of functionality in a high-level Hardware Description Language (HDL) and the choice of a system-level performance-versus-availability operating point. System availability, throughput, and mean time to isolate faults are monitored and maintained using an Observer-Controller model. Results are demonstrated using a Data Encryption Standard (DES) core that occupies approximately 305 FPGA slices on a Xilinx Virtex-II Pro FPGA. With a single simulated stuck-at fault, the system identifies a completely validated replacement configuration within three to five positive tests. The approach demonstrates a readily implemented yet robust organic hardware application framework featuring a high degree of autonomous self-control.
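
    The group-testing principle can be sketched as adaptive halving (the thesis uses discrepancy-enabled pairwise comparisons of real configurations; run_config and the resource numbering are stand-ins): each test loads a configuration exercising a subset of resources, and a discrepant output implicates that subset.

def run_config(resources, faulty):
    """Stand-in for loading one alternate configuration: it shows a
    discrepancy iff it uses the faulty resource."""
    return faulty in resources

def isolate(resources, faulty):
    tests = 0
    while len(resources) > 1:
        half = resources[: len(resources) // 2]
        tests += 1
        resources = half if run_config(half, faulty) else resources[len(half):]
    return resources[0], tests

print(isolate(list(range(64)), faulty=37))  # single stuck-at fault -> (37, 6)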

    Reliability analysis of triple modular redundancy system with spare

    Hardware-redundant fault-tolerant systems and their different design approaches are discussed. The reliability analysis of fault-tolerant systems is usually done under permanent fault conditions; with statistical data suggesting that up to 90% of system failures are caused by intermittent faults, reliability analysis must concentrate more on this class of faults. In this work, a reconfigurable Triple Modular Redundancy (TMR) system with a spare module that differentiates between permanent and intermittent faults has been built. The reconfiguration process of this system depends on both the current status of its modules and their history. Based on this, a different approach for reliability analysis under intermittent fault conditions using Markov models is presented. This approach shows a much higher system reliability compared to other redundant and non-redundant configurations.
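
    As a baseline for the Markov-model approach, the reliability of plain TMR under permanent faults follows from a three-state generator (the thesis's model adds the spare and intermittent-fault states on top of something like this; the failure rate is illustrative).

import numpy as np
from scipy.linalg import expm

lam = 1e-4                             # per-module failure rate, per hour
# States: 0 = three good, 1 = two good (still masked), 2 = failed (absorbing)
Q = np.array([[-3 * lam,  3 * lam,     0.0],
              [     0.0, -2 * lam, 2 * lam],
              [     0.0,      0.0,     0.0]])
p0 = np.array([1.0, 0.0, 0.0])
for t in (1e3, 5e3, 1e4):
    R = (p0 @ expm(Q * t))[:2].sum()   # P(system not yet failed by time t)
    print(f"t = {t:6.0f} h  R_TMR = {R:.4f}")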

    Quafu-Qcover: Explore Combinatorial Optimization Problems on Cloud-based Quantum Computers

    We present Quafu-Qcover, an open-source, cloud-based software package designed for solving combinatorial optimization problems, supporting both quantum simulators and hardware backends. Quafu-Qcover provides a standardized and complete workflow for solving combinatorial optimization problems using the Quantum Approximate Optimization Algorithm (QAOA). It automatically models the original problem as a quadratic unconstrained binary optimization (QUBO) model and the corresponding Ising model, which can be further transformed into a weighted graph. The core of Qcover relies on a graph-decomposition-based classical algorithm, which obtains the optimal parameters for the shallow QAOA circuit more efficiently. Quafu-Qcover includes a specialized compiler that translates QAOA circuits into physical quantum circuits capable of execution on Quafu cloud quantum computers. Compared to a general-purpose compiler, ours generates circuits of shorter depth and compiles faster. The Qcover compiler can build a library of qubit-coupling substructures in real time from the updated calibration data of the superconducting quantum devices, ensuring that a task is executed on physical qubits with higher fidelity. Quafu-Qcover allows sampling results to be retrieved at any time using a task ID, enabling asynchronous processing. Besides, it includes modules for result preprocessing and visualization, allowing for an intuitive display of solutions to combinatorial optimization problems. We hope that Quafu-Qcover can serve as a guiding example of how to explore application problems on the Quafu cloud quantum computers.
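
    The QUBO-to-Ising step mentioned above is standard algebra rather than anything specific to Quafu-Qcover (this sketch is not its API): substituting x_i = (1 - s_i) / 2 turns min x^T Q x over binary x into an Ising Hamiltonian with couplings J, fields h, and a constant offset.

import numpy as np

def qubo_to_ising(Q):
    Q = (Q + Q.T) / 2.0               # symmetrize
    J = Q / 4.0
    np.fill_diagonal(J, 0.0)          # pairwise couplings (i != j)
    h = -Q.sum(axis=1) / 2.0          # local fields
    offset = (Q.sum() + Q.trace()) / 4.0
    return J, h, offset

Q = np.array([[-1.0, 2.0],
              [ 2.0, -1.0]])          # tiny two-variable QUBO
J, h, offset = qubo_to_ising(Q)
print(J, h, offset)                   # s^T J s + h.s + offset matches the QUBO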

    Using Fine Grain Approaches for highly reliable Design of FPGA-based Systems in Space

    Nowadays the use of SRAM-based FPGAs in space missions is increasingly considered due to their flexibility and reprogrammability. A challenge is the devices' sensitivity to radiation effects, which has increased with modern architectures due to smaller CMOS structures. This work proposes fault tolerance methodologies that are based on a fine-grain view of modern reconfigurable architectures. The focus is on SEU mitigation challenges in SRAM-based FPGAs, which can result in critical situations.