    Fault-tolerance techniques for hybrid CMOS/nanoarchitecture

    The authors propose two fault-tolerance techniques for hybrid CMOS/nanoarchitecture implementing logic functions as look-up tables. The authors compare the efficiency of the proposed techniques with recently reported methods that use single coding schemes in tolerating high fault rates in nanoscale fabrics. Both proposed techniques are based on error correcting codes to tackle different fault rates. In the first technique, the authors implement a combined two-dimensional coding scheme using Hamming and Bose-Chaudhuri-Hocquenghem (BCH) codes to address fault rates greater than 5%. In the second technique, Hamming coding is complemented with a bad line exclusion technique to tolerate fault rates higher than the first proposed technique (up to 20%). The authors have also estimated the improvement that can be achieved in circuit reliability in the presence of don't-care conditions. The area, latency and energy costs of the proposed techniques were also estimated in the CMOS domain.
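    As a rough illustration of the Hamming-coding ingredient only, the sketch below stores the four truth-table bits of a 2-input look-up table as a Hamming(7,4) codeword and recovers them after a single stuck or flipped bit. It is not the authors' two-dimensional Hamming/BCH scheme or their bad line exclusion technique; the function names and the XOR example are assumptions made for illustration.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-indexed position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Hypothetical example: truth table of a 2-input XOR stored in a faulty LUT.
xor_table = [0, 1, 1, 0]
stored = hamming74_encode(xor_table)
stored[5] ^= 1  # model a single defective memory cell
recovered = hamming74_decode(stored)
```

    A single Hamming(7,4) word corrects only one faulty bit per codeword, which is one way to see why a single coding scheme runs out of margin at high fault rates.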

    Engineering failure analysis and design optimisation with HiP-HOPS

    The scale and complexity of computer-based safety critical systems, like those used in the transport and manufacturing industries, pose significant challenges for failure analysis. Over the last decade, research has focused on automating this task. In one approach, predictive models of system failure are constructed from the topology of the system and local component failure models using a process of composition. An alternative approach employs model-checking of state automata to study the effects of failure and verify system safety properties. In this paper, we discuss these two approaches to failure analysis. We then focus on Hierarchically Performed Hazard Origin & Propagation Studies (HiP-HOPS) - one of the more advanced compositional approaches - and discuss its capabilities for automatic synthesis of fault trees, combinatorial Failure Modes and Effects Analyses, and reliability versus cost optimisation of systems via application of automatic model transformations. We summarise these contributions and demonstrate the application of HiP-HOPS on a simplified fuel oil system for a ship engine. In light of this example, we discuss strengths and limitations of the method in relation to other state-of-the-art techniques. In particular, because HiP-HOPS is deductive in nature, relating system failures back to their causes, it is less prone to combinatorial explosion and can more readily be iterated. For this reason, it enables exhaustive assessment of combinations of failures and design optimisation using computationally expensive meta-heuristics. (C) 2010 Elsevier Ltd. All rights reserved
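    To make the notion of a fault tree concrete, here is a minimal sketch of evaluating the top-event probability of a small tree built from AND and OR gates. It assumes statistically independent basic events that each appear only once in the tree, and it is not the HiP-HOPS tool or its algorithms; the tree shape, event names and probabilities are hypothetical and are not taken from the paper's fuel oil example.

```python
from math import prod


def evaluate(node, basic_probs):
    """Return the probability of a fault-tree node.

    A node is either a basic-event name (str) or a tuple (gate, children)
    where gate is "AND" or "OR". Basic events are assumed independent.
    """
    if isinstance(node, str):
        return basic_probs[node]
    gate, children = node
    child_probs = [evaluate(child, basic_probs) for child in children]
    if gate == "AND":
        return prod(child_probs)
    if gate == "OR":
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical tree: flow is lost if the filter blocks OR both pumps fail.
tree = ("OR", ["filter_blocked", ("AND", ["pump_a_fails", "pump_b_fails"])])
basic_probs = {"filter_blocked": 0.5, "pump_a_fails": 0.5, "pump_b_fails": 0.5}
top_event_probability = evaluate(tree, basic_probs)
```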

    On the diagnostic emulation technique and its use in the AIRLAB

    An aid is presented for understanding and judging the relevance of the diagnostic emulation technique to studies of highly reliable, digital computing systems for aircraft. A short review is presented of the need for and the use of the technique as well as an explanation of its principles of operation and implementation. Details that would be needed for operational control or modification of existing versions of the technique are not described

    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning engine practice and of the strategies for integrating Case Based Reasoning (CBR) and Model Based Reasoning (MBR) that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to:
    a. Provide an innovative, 'intelligent', knowledge based solution aimed at improving the quality of critical decisions.
    b. Enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety critical incidents, irrespective of their location.
    In other words, RIMSAT aims to design and implement a decision support system that applies Case Based Reasoning as well as Model Based Reasoning technology to the management of emergency situations. This document is part of a deliverable for the RIMSAT project, and although it has been written in close contact with the requirements of the project, it provides an overview wide enough to serve as a state of the art in integration strategies between CBR and MBR technologies.

    Considerations for a design and operations knowledge support system for Space Station Freedom

    Engineering and operations of modern engineered systems depend critically upon detailed design and operations knowledge that is accurate and authoritative. A design and operations knowledge support system (DOKSS) is a modern computer-based information system providing knowledge about the creation, evolution, and growth of an engineered system. The purpose of a DOKSS is to provide convenient and effective access to this multifaceted information. The complexity of Space Station Freedom's (SSF's) systems, elements, interfaces, and organizations makes convenient access to design knowledge especially important, when compared to simpler systems. The life cycle length, being 30 or more years, adds a new dimension to space operations, maintenance, and evolution. Provided here is a review and discussion of design knowledge support systems to be delivered and operated as a critical part of the engineered system. A concept of a DOKSS for Space Station Freedom (SSF) is presented. This is followed by a detailed discussion of a DOKSS for the Lyndon B. Johnson Space Center and Work Package-2 portions of SSF

    Toward the assessment of the susceptibility of a digital system to lightning upset

    Accomplishments and directions for further research aimed at developing methods for assessing a candidate design of an avionic computer with respect to susceptibility to lightning upset are reported. Emphasis is on fault tolerant computers. Both lightning stress and shielding are covered in a review of the electromagnetic environment. Stress characterization, system characterization, upset detection, and positive and negative design features are considered. A first cut theory of comparing candidate designs is presented, including tests of comparative susceptibility as well as its analysis and simulation. An approach to lightning induced transient fault effects is included.

    Quantum Computing: Pro and Con

    I assess the potential of quantum computation. Broad and important applications must be found to justify construction of a quantum computer; I review some of the known quantum algorithms and consider the prospects for finding new ones. Quantum computers are notoriously susceptible to making errors; I discuss recently developed fault-tolerant procedures that enable a quantum computer with noisy gates to perform reliably. Quantum computing hardware is still in its infancy; I comment on the specifications that should be met by future hardware. Over the past few years, work on quantum computation has erected a new classification of computational complexity, has generated profound insights into the nature of decoherence, and has stimulated the formulation of new techniques in high-precision experimental physics. A broad interdisciplinary effort will be needed if quantum computers are to fulfill their destiny as the world's fastest computing devices. (This paper is an expanded version of remarks that were prepared for a panel discussion at the ITP Conference on Quantum Coherence and Decoherence, 17 December 1996.)
    Comment: 17 pages, LaTeX, submitted to Proc. Roy. Soc. Lond. A, minor correction

    Overhead and noise threshold of fault-tolerant quantum error correction

    Fault tolerant quantum error correction (QEC) networks are studied by a combination of numerical and approximate analytical treatments. The probability of failure of the recovery operation is calculated for a variety of CSS codes, including large block codes and concatenated codes. Recent insights into the syndrome extraction process, which render the whole process more efficient and more noise-tolerant, are incorporated. The average number of recoveries which can be completed without failure is thus estimated as a function of various parameters. The main parameters are the gate (gamma) and memory (epsilon) failure rates, the physical scale-up of the computer size, and the time t_m required for measurements and classical processing. The achievable computation size is given as a surface in parameter space. This indicates the noise threshold as well as other information. It is found that concatenated codes based on the [[23,1,7]] Golay code give higher thresholds than those based on the [[7,1,3]] Hamming code under most conditions. The threshold gate noise gamma_0 is a function of epsilon/gamma and t_m; example values are {epsilon/gamma, t_m, gamma_0} = {1, 1, 0.001}, {0.01, 1, 0.003}, {1, 100, 0.0001}, {0.01, 100, 0.002}, assuming zero cost for information transport. This represents an order of magnitude increase in tolerated memory noise, compared with previous calculations, which is made possible by recent insights into the fault-tolerant QEC process.
    Comment: 21 pages, 12 figures, minor mistakes corrected and layout improved, ref added; v4: clarification of assumption re logic gate

    Fault-tolerant sub-lithographic design with rollback recovery

    Shrinking feature sizes and energy levels coupled with high clock rates and decreasing node capacitance lead us into a regime where transient errors in logic cannot be ignored. Consequently, several recent studies have focused on feed-forward spatial redundancy techniques to combat these high transient fault rates. To complement these studies, we analyze fine-grained rollback techniques and show that they can offer lower spatial redundancy factors with no significant impact on system performance for fault rates up to one fault per device per ten million cycles of operation (Pf = 10^-7) in systems with 10^12 susceptible devices. Further, we concretely demonstrate these claims on nanowire-based programmable logic arrays. Despite expensive rollback buffers and general-purpose, conservative analysis, we show the area overhead factor of our technique is roughly an order of magnitude lower than a gate level feed-forward redundancy scheme
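    The fault-rate figures quoted in this abstract can be put in perspective with a short back-of-the-envelope calculation. The sketch below takes Pf = 10^-7 and 10^12 devices from the abstract and assumes independent faults; the 1,000-device block size is a hypothetical choice for illustration, not a number from the paper.

```python
import math

PF = 1e-7        # fault probability per device per cycle (from the abstract)
DEVICES = 1e12   # susceptible devices in the system (from the abstract)
BLOCK = 1_000    # hypothetical size of one rollback-protected block


def expected_faults_per_cycle(n_devices, pf=PF):
    """Mean number of device faults in one cycle, assuming independent faults."""
    return n_devices * pf


def p_fault_free_cycle(n_devices, pf=PF):
    """Probability that none of n_devices faults in one cycle: (1 - pf)^n."""
    return math.exp(n_devices * math.log1p(-pf))


system_faults = expected_faults_per_cycle(DEVICES)  # about 1e5 faults per cycle
system_clean = p_fault_free_cycle(DEVICES)          # effectively zero
block_clean = p_fault_free_cycle(BLOCK)             # close to 1 - 1e-4
```

    Under these assumptions the whole system is essentially never fault-free in a given cycle, while a small block is fault-free in all but about one cycle in ten thousand, which suggests why recovery at fine granularity can be cheap in terms of performance.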