
    Redundant Logic Insertion and Fault Tolerance Improvement in Combinational Circuits

    This paper presents a novel method to identify and insert redundant logic into a combinational circuit to improve its fault tolerance without replicating the entire circuit, as conventional redundancy techniques require. It discusses how to estimate the fault-masking capability of a combinational circuit using the truth-cum-fault enumeration table, and then shows how to identify logic that can be introduced to add redundancy to the original circuit without affecting its native functionality, with the aim of improving its fault tolerance, though this involves some trade-off in the design metrics. Care should be taken, however, because redundant logic insertion may give rise to new internal nodes, and faults on those nodes may affect the fault tolerance of the resulting circuit. The combinational circuit considered and its redundant counterparts are all implemented in a semi-custom design style using a 32/28nm CMOS digital cell library, and their respective design metrics and fault tolerances are compared.
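    The abstract does not spell out the layout of the truth-cum-fault enumeration table, so the following is only a minimal sketch of the general idea, under stated assumptions: enumerate every input combination against every single stuck-at fault and count how often the output still matches the fault-free value. The example circuit `y = (a AND b) OR c`, the node names, and the functions `evaluate` and `fault_masking_counts` are hypothetical and do not come from the paper.

```python
from itertools import product

# Hypothetical example circuit (not from the paper): y = (a AND b) OR c,
# with one internal node n1 = a AND b.
NODES = ["a", "b", "c", "n1", "y"]


def evaluate(inputs, fault=None):
    """Evaluate the circuit; fault is None or (node_name, stuck_value)."""
    def node(name, value):
        # A stuck-at fault overrides whatever value the node would carry.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a = node("a", inputs[0])
    b = node("b", inputs[1])
    c = node("c", inputs[2])
    n1 = node("n1", a & b)
    return node("y", n1 | c)


def fault_masking_counts():
    """Return (masked, total) over all inputs and single stuck-at faults.

    Assumption: a fault counts as masked whenever the faulty output equals
    the fault-free output, including cases where the fault is not activated.
    """
    masked = 0
    total = 0
    for inputs in product((0, 1), repeat=3):
        golden = evaluate(inputs)
        for name in NODES:
            for stuck in (0, 1):
                total += 1
                if evaluate(inputs, (name, stuck)) == golden:
                    masked += 1
    return masked, total


masked, total = fault_masking_counts()
print(f"masked {masked} of {total} (input, fault) pairs")
```

    Repeating the same count on a variant of the circuit with added redundant logic, with the new internal nodes included in `NODES`, would give a like-for-like comparison of the two designs.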

    Algorithmic Based Fault Tolerance Applied to High Performance Computing

    We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance technique (Huang and Abraham, 1984) to the needs of parallel distributed computation. We obtain a strongly scalable mechanism for fault tolerance. We can also detect and correct errors (bit-flips) on the fly during a computation. To assess the viability of our approach, we have developed a fault-tolerant matrix-matrix multiplication subroutine and we propose some models to predict its running time. Our parallel fault-tolerant matrix-matrix multiplication scores 1.4 TFLOPS on 484 processors (cluster jacquard.nersc.gov) and returns a correct result even though one process failure has occurred. This represents 65% of the machine peak efficiency and less than 12% overhead with respect to the fastest failure-free implementation. We predict (and have observed) that, as we increase the processor count, the overhead of the fault tolerance drops significantly.
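    The checksum idea behind Algorithmic Based Fault Tolerance can be illustrated with a small serial sketch. This is not the parallel subroutine described in the abstract: it assumes integer matrices so that checksums compare exactly, it handles only a single corrupted data element (not a corrupted checksum entry or a lost process), and the function names are illustrative.

```python
def matmul(a, b):
    """Plain triple-loop matrix product on lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b a final entry holding that row's sum."""
    return [row + [sum(row)] for row in b]


def locate_and_correct(cf):
    """Check a full-checksum product in place.

    Returns None if all checksums agree, or the (row, col) of the single
    data element that was repaired. Raises ValueError for any error
    pattern this sketch does not handle.
    """
    n, m = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n) if sum(cf[i][:m]) != cf[i][m]]
    bad_cols = [j for j in range(m)
                if sum(cf[i][j] for i in range(n)) != cf[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # The row checksum tells us by how much the element is off.
        cf[i][j] += cf[i][m] - sum(cf[i][:m])
        return (i, j)
    raise ValueError("error pattern not correctable by this sketch")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
cf = matmul(with_column_checksum(a), with_row_checksum(b))
cf[1][0] += 8  # simulate a corrupted element in the product
print(locate_and_correct(cf))
```

    Because the checksum row and column are carried through the multiplication itself, the product arrives with its own consistency checks, and an inconsistent row and column together pinpoint the faulty element.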

    Model Prediction-Based Approach to Fault Tolerant Control with Applications

    Fault-tolerant control (FTC) is an integral component in industrial processes as it enables the system to continue robust operation under some conditions. In this paper, an FTC scheme is proposed for interconnected systems within an integrated design framework to provide timely monitoring and detection of faults and to reconfigure the controller according to those faults. The unscented Kalman filter (UKF)-based fault detection and diagnosis system is initially run on the main plant, and parameter estimation is performed for the local faults. This critical information is shared through information fusion with the main system, where the whole system is decentralized using the overlapping decomposition technique. Using these parameter estimates of the decentralized subsystems, a model predictive control (MPC) adjusts its parameters according to the fault scenarios, thereby striving to maintain the stability of the system. Experimental results on interconnected continuous time stirred tank reactors (CSTR) with recycle and on a quadruple tank system indicate that the proposed method is capable of correctly identifying various faults and then controlling the system under some conditions.

    On the reliability of electrical drives for safety-critical applications

    The aim of this work is to present some issues related to fault-tolerant electric drives, which are able to overcome different types of faults occurring in the sensors, in the power converter and in the electrical machine, without compromising the overall functionality of the system. These features are of utmost importance in safety-critical applications. In this paper, the reliability of both commercial and innovative drive configurations, which use redundant hardware and suitable control algorithms, will be investigated for the most common types of fault: besides standard three-phase motor drives, multiphase topologies, open-end winding solutions and multi-machine configurations will also be analyzed, applied to various electric motor technologies. The complexity of hardware and control strategies will also be compared in this paper, since this has a tremendous impact on the investment costs.

    On quantifying fault patterns of the mesh interconnect networks

    One of the key issues in the design of Multiprocessor Systems-on-Chip (MP-SoCs), multicomputers, and peer-to-peer networks is the development of an efficient communication network that provides high throughput and low latency and is able to survive the failure of individual components. Generally, the faulty components may be coalesced into fault regions, which are classified into convex and concave shapes. In this paper, we propose a mathematical solution for counting the number of common fault patterns in a 2-D mesh interconnect network, including both convex (|-shape, ||-shape, □-shape) and concave (L-shape, U-shape, T-shape, +-shape, H-shape) regions. The results presented in this paper, which have been validated through simulation experiments, can play a key role in studying, in particular, the performance of fault-tolerant routing algorithms and in measuring a network's fault tolerance, expressed as the probability of a disconnection.