Fault-tolerant computer study

Author: Avizienis A. A.
Ercegovac M. D.
Rennels D. A.

This study describes a set of building-block circuits that can be used with commercially available microprocessors and memories to implement fault-tolerant distributed computer systems. Each building-block circuit is intended for VLSI implementation as a single chip. Several building blocks, together with the associated processor and memory chips, form a self-checking computer module with self-contained input/output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self-checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology that led to the definition of the building-block circuits are discussed.

NASA Technical Reports Server

Layout level design for testability strategy applied to a CMOS cell library

Author: Blom F.C.
Ferrer C.
Oliver J.
Rullan M.
Publication venue: IEEE
Publication date: 01/01/1993

The layout-level design for testability (LLDFT) rules used here make it possible to avoid some hard-to-detect or even undetectable faults in a cell library by modifying the cell layouts without changing their behavior, while achieving a good level of reliability. These rules avoid some open faults or reduce their probability of occurrence. The main purpose has been to apply this set of LLDFT rules to the cells of the library designed at the Centre Nacional de Microelectronica (CNM) in order to obtain a highly testable cell library. The authors summarize the main results (area overhead and performance degradation) of the application of the LLDFT rules on the cell

University of Twente Research Information

Fault-tolerant building-block computer study

Author: Rennels D. A.

Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis and failure circumvention and, in some cases, serve as an automated repairman for their host systems. The study describes a small set of building-block circuits that can be implemented as single very-large-scale integration (VLSI) devices and used with off-the-shelf microprocessors and memories to build self-checking computer modules (SCCMs). Each SCCM is a microcomputer that is capable of detecting its own faults during normal operation and is designed to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation.

NASA Technical Reports Server

Combined Time and Information Redundancy for SEU-Tolerance in Energy-Efficient Real-Time Systems

Author: Al-Hashimi Bashir M.
Ejlali Ali
Miremadi Seyed G.
Rosinger Paul
Schmitz Marcus
Publication date: 01/04/2006

Recent work has highlighted the trade-off between energy consumption and fault tolerance in real-time systems. This work has focused on dynamic voltage scaling (DVS) to reduce dynamic energy dissipation and on time redundancy to achieve transient-fault tolerance. The time-redundancy technique exploits the available slack time to increase fault tolerance by performing recovery executions, whereas DVS exploits the same slack time to save energy. We therefore believe there is a resource conflict between the time-redundancy technique and DVS. The first aim of this paper is to propose the use of information redundancy to solve this problem. We demonstrate through analytical and experimental studies that, compared with time redundancy alone, a combination of information and time redundancy can achieve both higher transient-fault tolerance (tolerance to single event upsets (SEUs)) and lower energy consumption. The second aim of this paper is to analyze the interplay between transient-fault tolerance (SEU tolerance) and adaptive body biasing (ABB), which is used to reduce static leakage energy; this interplay has not been addressed in previous studies. We show that the same technique (i.e. the combination of time and information redundancy) is applicable to ABB-enabled systems and provides more advantages than time redundancy alone.
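The slack conflict described in this abstract can be illustrated with a deliberately simplified model. The sketch below is not the paper's analysis. It assumes that the task and all of its recovery executions run at one normalized clock speed, that supply voltage scales linearly with frequency, and that dynamic energy per execution therefore scales with the square of the speed. The function names `min_speed` and `relative_dynamic_energy` are hypothetical.

```python
def min_speed(wcet, deadline, recoveries):
    """Lowest normalized clock speed (0 < s <= 1) at which the task plus
    `recoveries` re-executions still finish by the deadline.

    wcet is the worst-case execution time at full speed (s = 1).
    Assumption: every execution runs at the same speed.
    """
    if wcet <= 0 or deadline <= 0 or recoveries < 0:
        raise ValueError("wcet and deadline must be positive, recoveries >= 0")
    speed = (recoveries + 1) * wcet / deadline
    if speed > 1.0:
        raise ValueError("not schedulable even at full speed")
    return speed


def relative_dynamic_energy(speed):
    """Dynamic energy of one execution relative to running at full speed.

    Assumption: voltage is proportional to frequency, so energy per
    execution is proportional to speed squared.
    """
    return speed ** 2


if __name__ == "__main__":
    # Each recovery execution reserved in the slack raises the minimum speed,
    # and with it the energy of the fault-free run.
    for k in range(3):
        s = min_speed(wcet=2.0, deadline=10.0, recoveries=k)
        print(k, s, relative_dynamic_energy(s))
```

Under these assumptions, every unit of slack reserved for recovery is slack that DVS cannot use to slow the processor down, which is the conflict the abstract refers to.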

Southampton (e-Prints Soton)

Redundant Logic Insertion and Fault Tolerance Improvement in Combinational Circuits

Author: Balasubramanian P
Naayagi R T
Publication date: 21/07/2017

This paper presents a novel method to identify and insert redundant logic into a combinational circuit to improve its fault tolerance without having to replicate the entire circuit, as is the case with conventional redundancy techniques. In this context, it is discussed how to estimate the fault masking capability of a combinational circuit using the truth-cum-fault enumeration table. It is then shown how to identify the logic that can be introduced to add redundancy to the original circuit without affecting its native functionality, with the aim of improving its fault tolerance, though this involves some trade-off in the design metrics. However, care should be taken when introducing redundant logic, since its insertion may give rise to new internal nodes, and faults on those nodes may affect the fault tolerance of the resulting circuit. The combinational circuit considered and its redundant counterparts are all implemented in a semi-custom design style using a 32/28nm CMOS digital cell library, and their respective design metrics and fault tolerances are compared.
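The abstract does not spell out how the truth-cum-fault enumeration table is built, so the following is only one plausible reading: enumerate every input vector against every single stuck-at fault and count how often the faulty output still matches the fault-free output. The example circuit `out = (a AND b) OR c`, the node names, and the function names are all assumptions made for illustration.

```python
from itertools import product

NODES = ["a", "b", "c", "n1"]  # primary inputs plus one internal node
# Single stuck-at faults: each node stuck at 0 or stuck at 1.
FAULTS = [(node, value) for node in NODES for value in (0, 1)]


def circuit(a, b, c, fault=None):
    """Evaluate out = (a AND b) OR c with an optional single stuck-at fault.

    fault is None or a (node, value) pair that forces that node's value.
    """
    def apply(name, value):
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = apply("a", a), apply("b", b), apply("c", c)
    n1 = apply("n1", a & b)
    return n1 | c


def masking_ratio():
    """Return (masked, total) over all (input vector, fault) pairs."""
    masked = total = 0
    for a, b, c in product((0, 1), repeat=3):
        good = circuit(a, b, c)
        for fault in FAULTS:
            total += 1
            if circuit(a, b, c, fault) == good:
                masked += 1
    return masked, total


if __name__ == "__main__":
    masked, total = masking_ratio()
    print(f"{masked}/{total} fault/input pairs are masked")
```

Running the same enumeration on a candidate circuit with redundant logic added would show whether the extra nodes raise or lower the masked fraction, which is the caution the abstract raises about new internal nodes.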

arXiv.org e-Print Archive

Crossref

Automated Synthesis of SEU Tolerant Architectures from OO Descriptions

Author: Chiusano Silvia Anna
Di Carlo Stefano
Prinetto Paolo Ernesto
Publication venue: IEEE Computer Society
Publication date: 01/01/2002

SEU faults are a well-known problem in the aerospace environment, but recently their relevance has also grown at ground level in commodity applications, where they are coupled with strong economic constraints in terms of cost reduction. On the other hand, the latest hardware description languages and synthesis tools reduce the boundary between the software and hardware domains, making high-level descriptions of hardware components very similar to software programs. Starting from these considerations, the present paper analyses the possibility of reusing Software Implemented Hardware Fault Tolerance (SIHFT) techniques, typically exploited in microprocessor-based systems, to design SEU-tolerant architectures. The main characteristics of SIHFT techniques are examined, as well as how they have to be modified to be compatible with the synthesis flow. A complete environment is provided to automate the design instrumentation using the proposed techniques and to perform fault injection experiments at both the behavioural and gate levels. Preliminary results presented in this paper show the effectiveness of the approach in terms of reliability improvement and reduced design effort.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Yield Enhancement of Digital Microfluidics-Based Biochips Using Space Redundancy and Local Reconfiguration

Author: Chakrabarty Krishnendu
Pamula Vamsee K.
Su Fei
Publication date: 25/10/2007

As microfluidics-based biochips become more complex, manufacturing yield will have significant influence on production volume and product cost. We propose an interstitial redundancy approach to enhance the yield of biochips that are based on droplet-based microfluidics. In this design method, spare cells are placed in the interstitial sites within the microfluidic array, and they replace neighboring faulty cells via local reconfiguration. The proposed design method is evaluated using a set of concurrent real-life bioassays. Comment: Submitted on behalf of EDAA (http://www.edaa.com/

arXiv.org e-Print Archive

CiteSeerX

Multilevel Clustering Fault Model for IC Manufacture

Author: Bogdanov Yu. I.
Bogdanova N. A.
Rudnev A. V.
Publication venue: SPIE-Intl Soc Optical Eng
Publication date: 02/10/2003

A hierarchical approach to the construction of compound distributions for process-induced faults in IC manufacture is proposed. Within this framework, the negative binomial distribution is treated as a level-1 model. The hierarchical approach to fault distribution offers an integrated picture of how fault density varies from region to region within a wafer, from wafer to wafer within a batch, and so on. A theory of compound-distribution hierarchies is developed by means of generating functions. A study of correlations, which naturally appear in microelectronics due to the batch character of IC manufacture, is proposed. Taking these correlations into account is of significant importance for developing procedures for statistical quality control in IC manufacture. With respect to applications, hierarchies of yield means and yield probability-density functions are considered. Comment: 10 pages, the International Conference "Micro- and Nanoelectronics- 2003" (ICMNE-2003), Zvenigorod, Moscow district, Russia, October 6-10, 200
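For orientation, the textbook way in which the negative binomial arises as a compound distribution can be written with generating functions. This is the standard Poisson-gamma result and may differ from the paper's own notation. The symbols are assumptions: the fault count per chip is taken to be Poisson with mean $\lambda$, and $\lambda$ is taken to vary according to a gamma density $p(\lambda)$ with mean $\bar{\lambda}$ and shape parameter $\alpha$.

```latex
\begin{align}
  G(z \mid \lambda) &= e^{\lambda (z - 1)}
    && \text{(Poisson generating function)} \\
  G(z) &= \int_0^\infty e^{\lambda (z - 1)}\, p(\lambda)\, d\lambda
        = \left( 1 - \frac{\bar{\lambda}\,(z - 1)}{\alpha} \right)^{-\alpha}
    && \text{(negative binomial)} \\
  Y &= G(0) = \left( 1 + \frac{\bar{\lambda}}{\alpha} \right)^{-\alpha}
    && \text{(probability of zero faults, i.e. yield)}
\end{align}
```

A hierarchy of the kind the abstract describes would presumably repeat the compounding step, letting the parameters of one level vary randomly at the next level (region, wafer, batch).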

arXiv.org e-Print Archive

Crossref

Fault Secure Encoder and Decoder for NanoMemory Applications

Author: DeHon André
Naeimi Helia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/04/2009

Memory cells have been protected from soft errors for more than a decade; due to the increase in soft error rate in logic circuits, the encoder and decoder circuitry around the memory blocks has become susceptible to soft errors as well and must also be protected. We introduce a new approach to designing fault-secure encoder and decoder circuitry for memory designs. The key novel contribution of this paper is identifying and defining a new class of error-correcting codes whose redundancy makes the design of fault-secure detectors (FSD) particularly simple. We further quantify the importance of protecting encoder and decoder circuitry against transient errors, illustrating a scenario where the system failure rate (FIT) is dominated by the failure rate of the encoder and decoder. We prove that Euclidean geometry low-density parity-check (EG-LDPC) codes have the fault-secure detector capability. Using some of the smaller EG-LDPC codes, we can tolerate bit or nanowire defect rates of 10% and fault rates of 10^(-18) upsets/device/cycle, achieving a FIT rate at or below one for the entire memory system and a memory density of 10^(11) bit/cm^2 with nanowire pitch of 10 nm for memory blocks of 10 Mb or larger. Larger EG-LDPC codes can achieve even higher reliability and lower area overhead.

Caltech Authors