3,561 research outputs found

    An On-line BIST RAM Architecture with Self Repair Capabilities

    Get PDF
    The emerging field of self-repair computing is expected to have a major impact on deployable systems for space missions and defense applications, where high reliability, availability, and serviceability are needed. In this context, RAM (random access memories) are among the most critical components. This paper proposes a built-in self-repair (BISR) approach for RAM cores. The proposed design, introducing minimal and technology-dependent overheads, can detect and repair a wide range of memory faults including: stuck-at, coupling, and address faults. The test and repair capabilities are used on-line, and are completely transparent to the external user, who can use the memory without any change in the memory-access protocol. Using a fault-injection environment that can emulate the occurrence of faults inside the module, the effectiveness of the proposed architecture in terms of both fault detection and repairing capability was verified. Memories of various sizes have been considered to evaluate the area-overhead introduced by this proposed architectur

    Yield Enhancement of Digital Microfluidics-Based Biochips Using Space Redundancy and Local Reconfiguration

    Full text link
    As microfluidics-based biochips become more complex, manufacturing yield will have significant influence on production volume and product cost. We propose an interstitial redundancy approach to enhance the yield of biochips that are based on droplet-based microfluidics. In this design method, spare cells are placed in the interstitial sites within the microfluidic array, and they replace neighboring faulty cells via local reconfiguration. The proposed design method is evaluated using a set of concurrent real-life bioassays.Comment: Submitted on behalf of EDAA (http://www.edaa.com/

    Fault-tolerant multichannel demultiplexer subsystems

    Get PDF
    Fault tolerance in future processing and switching communication satellites is addressed by showing new methods for detecting hardware failures in the first major subsystem, the multichannel demultiplexer. An efficient method for demultiplexing frequency slotted channels uses multirate filter banks which contain fast Fourier transform processing. All numerical processing is performed at a lower rate commensurate with the small bandwidth of each bandbase channel. The integrity of the demultiplexing operations is protected by using real number convolutional codes to compute comparable parity values which detect errors at the data sample level. High rate, systematic convolutional codes produce parity values at a much reduced rate, and protection is achieved by generating parity values in two ways and comparing them. Parity values corresponding to each output channel are generated in parallel by a subsystem, operating even slower and in parallel with the demultiplexer that is virtually identical to the original structure. These parity calculations may be time shared with the same processing resources because they are so similar

    Periodic Application of Concurrent Error Detection in Processor Array Architectures

    Get PDF
    Processor arrays can provide an attractive architecture for some applications. Featuring modularity, regular interconnection and high parallelism, such arrays are well-suited for VLSI/WSI implementations, and applications with high computational requirements, such as real-time signal processing. Preserving the integrity of results can be of paramount importance for certain applications. In these cases, fault tolerance should be used to ensure reliable delivery of a system's service. One aspect of fault tolerance is the detection of errors caused by faults. Concurrent error detection (CED) techniques offer the advantage that transient and intermittent faults may be detected with greater probability than with off-line diagnostic tests. Applying time-redundant CED techniques can reduce hardware redundancy costs. However, most time-redundant CED techniques degrade a system's performance

    Analysis and design of algorithm-based fault-tolerant systems

    Get PDF
    An important consideration in the design of high performance multiprocessor systems is to ensure the correctness of the results computed in the presence of transient and intermittent failures. Concurrent error detection and correction have been applied to such systems in order to achieve reliability. Algorithm Based Fault Tolerance (ABFT) was suggested as a cost-effective concurrent error detection scheme. The research was motivated by the complexity involved in the analysis and design of ABFT systems. To that end, a matrix-based model was developed and, based on that, algorithms for both the design and analysis of ABFT systems are formulated. These algorithms are less complex than the existing ones. In order to reduce the complexity further, a hierarchical approach is developed for the analysis of large systems

    Redundancy management for efficient fault recovery in NASA's distributed computing system

    Get PDF
    The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance

    Design of a fault tolerant airborne digital computer. Volume 1: Architecture

    Get PDF
    This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

    Techniques for the realization of ultra- reliable spaceborne computer Final report

    Get PDF
    Bibliography and new techniques for use of error correction and redundancy to improve reliability of spaceborne computer
    • …
    corecore