25,693 research outputs found

    An On-line BIST RAM Architecture with Self Repair Capabilities

    Get PDF
    The emerging field of self-repair computing is expected to have a major impact on deployable systems for space missions and defense applications, where high reliability, availability, and serviceability are needed. In this context, RAM (random access memories) are among the most critical components. This paper proposes a built-in self-repair (BISR) approach for RAM cores. The proposed design, introducing minimal and technology-dependent overheads, can detect and repair a wide range of memory faults including: stuck-at, coupling, and address faults. The test and repair capabilities are used on-line, and are completely transparent to the external user, who can use the memory without any change in the memory-access protocol. Using a fault-injection environment that can emulate the occurrence of faults inside the module, the effectiveness of the proposed architecture in terms of both fault detection and repairing capability was verified. Memories of various sizes have been considered to evaluate the area-overhead introduced by this proposed architectur

    Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

    Full text link
    Data centres that use consumer-grade disks drives and distributed peer-to-peer systems are unreliable environments to archive data without enough redundancy. Most redundancy schemes are not completely effective for providing high availability, durability and integrity in the long-term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large scale storage system. Our motivation is to design flexible and practical erasure codes with high fault-tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. Two other parameters increase fault-tolerance even further without the need of additional storage. As a result, an entangled storage system can provide high availability, durability and offer additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality, hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios.Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN

    Fault-tolerant computer study

    Get PDF
    A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed

    Design methods for fault-tolerant navigation computers

    Get PDF
    Design methods for fault tolerant navigation computer

    Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    Get PDF
    The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts

    Fault-tolerance techniques for hybrid CMOS/nanoarchitecture

    Get PDF
    The authors propose two fault-tolerance techniques for hybrid CMOS/nanoarchitecture implementing logic functions as look-up tables. The authors compare the efficiency of the proposed techniques with recently reported methods that use single coding schemes in tolerating high fault rates in nanoscale fabrics. Both proposed techniques are based on error correcting codes to tackle different fault rates. In the first technique, the authors implement a combined two-dimensional coding scheme using Hamming and Bose-Chaudhuri-Hocquenghem (BCH) codes to address fault rates greater than 5. In the second technique, Hamming coding is complemented with bad line exclusion technique to tolerate fault rates higher than the first proposed technique (up to 20). The authors have also estimated the improvement that can be achieved in the circuit reliability in the presence of Don-t Care Conditions. The area, latency and energy costs of the proposed techniques were also estimated in the CMOS domain

    Efficient Built In Self Repair Strategy for Embedded SRAM with selecteble redundancy

    Get PDF
    Built-in self -test (BIST) refers to those testing techniques where additional hardware is added to a design so that testing is accomplished without the aid of external hardware. Usually, a pseudo-random generator is used to apply test vectors to the circuit under test and a data compactor is used to produce a signature. To increase the reliability and yield of embedded memories, many redundancy mechanisms have been proposed. All the redundancy mechanisms bring penalty of area and complexity to embedded memories design. Considered that compiler is used to configure SRAM for different needs, the BISR had better bring no change to other modules in SRAM. To solve the problem, a new redundancy scheme is proposed in this paper. Some normal words in embedded memories can be selected as redundancy instead of adding spare words, spare rows, spare columns or spare blocks. Built-In Self-Repair (BISR) with Redundancy is an effective yield-enhancement strategy for embedded memories. This paper proposes an efficient BISR strategy which consists of a Built-In Self-Test (BIST) module, a Built-In Address-Analysis (BIAA) module and a Multiplexer (MUX) module. The BISR is designed flexible that it can provide four operation modes to SRAM users. Each fault address can be saved only once is the feature of the proposed BISR strategy. In BIAA module, fault addresses and redundant ones form a one- to- one mapping to achieve a high repair speed. Besides, instead of adding spare words, rows, columns or blocks in the SRAMs, users can select normal words as redundancy

    A Review paper on the Memory Built-In Self-Repair with Redundancy Logic

    Full text link
    The Present review paper expresses the word oriented memory test methodology for Built-In Self-Repair (BISR). To replace the defect words few logics are introduced. These logics are memory BIST logic and Wrapper logic. Whenever a test is carries on, the defected words are pointed out by its address only and these addresses are called failing address. The failing addresses are stored in the fuse box. Using fuse box it avoids the classic redundancy concept, where the RAMS has spare rows and columns. After the detection of faulty address, they are stored in redundancy logic. During test and redundancy configuration, the fuse box is connected to a scan registernbsp by this processnbsp inputnbsp and output data can be evaluated

    Fault diagnostic instrumentation design for environmental control and life support systems

    Get PDF
    As a development phase moves toward flight hardware, the system availability becomes an important design aspect which requires high reliability and maintainability. As part of continous development efforts, a program to evaluate, design, and demonstrate advanced instrumentation fault diagnostics was successfully completed. Fault tolerance designs for reliability and other instrumenation capabilities to increase maintainability were evaluated and studied
    corecore