    Cross-layer system reliability assessment framework for hardware faults

    System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored into such an estimation model, the delivered reliability reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and a supporting suite of tools for accurate but fast estimation of computing-system reliability. The backbone of the methodology is a component-based Bayesian model, which calculates system reliability from the masking probabilities of individual hardware and software components while considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault injection campaigns at the microarchitecture level.

    Model reduction by trimming for a class of semi-Markov reliability models and the corresponding error bound

    Semi-Markov processes have proved to be an effective and convenient tool to construct models of systems that achieve reliability by redundancy and reconfiguration. These models are able to depict complex system architectures and to capture the dynamics of fault arrival and system recovery. A disadvantage of this approach is that the models can be extremely large, which poses both a modeling and a computational problem. Techniques are needed to reduce the model size. Because these systems are used in critical applications where failure can be expensive, there must be an analytically derived bound for the error produced by the model reduction technique. A model reduction technique called trimming is presented that can be applied to a popular class of systems. Automatic model generation programs were written to help the reliability analyst produce models of complex systems. Trimming is easy to implement and its error bound is easy to compute. Hence, the method lends itself to inclusion in an automatic model generator.

    CARE 3 phase 2 report - mathematical description

    CARE III (Computer-Aided Reliability Estimation, version three), a computer program designed to help estimate the reliability of complex, redundant systems, is described. Although the program can model a wide variety of redundant structures, it was developed specifically for fault-tolerant avionics systems. CARE III generalizes the class of system structures that can be modeled and greatly expands the coverage model to take into account such effects as intermittent and transient faults, latent faults, and error propagation.

    The CARE 3 Phase 3 Report: Test and Evaluation

    CARE 3 (Computer-Aided Reliability Estimation, version three) is a computer program designed to help estimate the reliability of complex, redundant systems. Although the program can model a wide variety of redundant structures, it was developed specifically for fault-tolerant avionics systems, which are distinguished by the need for extremely reliable performance since a system failure could well result in the loss of human life. CARE 3 further generalizes the class of system structures that can be modeled and greatly expands the coverage model to take into account such effects as intermittent and transient faults, latent faults, error propagation, etc. The initial test and evaluation of CARE 3 are reported.

    Trading reliability targets within a supply chain using Shapley's value

    The development of complex systems involves a multi-tier supply chain, with each organisation allocated a reliability target for its sub-system or component part, apportioned from system requirements. Agreements about targets are made early in the system lifecycle when considerable uncertainty exists about the design detail and potential failure modes. Hence the resources required to achieve reliability are unpredictable. Some types of contracts provide incentives for organisations to negotiate targets so that system reliability requirements are met, but at minimum cost to the supply chain. This paper proposes a mechanism for deriving a fair price for trading reliability targets between suppliers using information gained about potential failure modes through development and the costs of activities required to generate such information. The approach is based upon Shapley's value and is illustrated through examples for a particular reliability growth model, and associated empirical cost model, developed for problems motivated by the aerospace industry. The paper aims to demonstrate the feasibility of the method and discuss how it could be extended to other reliability allocation models.
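The abstract above does not give its cost or reliability growth model, so the following is only a minimal sketch of how a Shapley value is computed in general. The three suppliers and the coalition values in `EXAMPLE_VALUES` are hypothetical numbers chosen for illustration, not data from the paper.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every possible joining order (feasible only for small games)."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical coalition values (e.g. cost savings from pooling targets).
EXAMPLE_VALUES = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

example_shares = shapley_values(["A", "B", "C"], EXAMPLE_VALUES.__getitem__)
```

The shares always sum to the value of the full coalition, which is the property that makes the allocation a candidate for a "fair" split. Enumerating all orderings grows factorially, so larger supply chains would need sampling or a structured value function.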

    Survival Signature-based Reliability Approach for Complex Systems Susceptible to Common Cause Failures

    The importance of reliability to complex systems cannot be disputed, as they are the backbone of our society. In practice, common cause failures may have a severe adverse effect on the overall stability of complex systems. The survival signature opens a new way to perform reliability analysis on systems with multiple component types. This paper undertakes survival signature-based reliability analysis of complex systems susceptible to common cause failures. Specifically, it proposes combining the standard α-factor model and the general α-factor model with the survival signature. In practical applications, the α-factor estimator of the system might not be completely defined because of limited data or knowledge, which requires imprecision to be taken into account. Some numerical cases are presented to show the applicability of the methods to complex systems. In addition, this paper may draw attention to the concept of Design for Reliability.
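As a rough illustration of the survival signature itself, the sketch below computes it by enumeration for a small hypothetical system (two type-1 components in parallel, in series with one type-2 component) and then evaluates system reliability. It assumes independent component failures, so it deliberately leaves out the common cause failures and α-factor models that the paper adds; all names are illustrative.

```python
from itertools import combinations, product
from math import comb


def survival_signature(type_members, structure):
    """Phi(l_1..l_K): fraction of component states with exactly l_k working
    components of each type k for which the system functions."""
    types = list(type_members)
    sig = {}
    ranges = (range(len(type_members[t]) + 1) for t in types)
    for counts in product(*ranges):
        choices = product(
            *(combinations(type_members[t], l) for t, l in zip(types, counts))
        )
        total = ok = 0
        for choice in choices:
            working = set().union(*choice)
            total += 1
            ok += bool(structure(working))
        sig[counts] = ok / total
    return sig


def system_survival(sig, type_sizes, reliabilities):
    """System reliability, assuming independent components that are
    identically distributed within each type."""
    result = 0.0
    for counts, phi in sig.items():
        term = phi
        for m, l, p in zip(type_sizes, counts, reliabilities):
            term *= comb(m, l) * p**l * (1 - p) ** (m - l)
        result += term
    return result


# Hypothetical system: (a1 OR a2) AND b1.
members = {"type1": ["a1", "a2"], "type2": ["b1"]}
works = lambda up: ("a1" in up or "a2" in up) and "b1" in up
sig = survival_signature(members, works)
reliability = system_survival(sig, [2, 1], [0.9, 0.8])
```

The signature separates the system structure from the component lifetime distributions, so `sig` is computed once and reused for any choice of per-type reliabilities.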

    Care 3 phase 2 report, maintenance manual

    CARE 3 (Computer-Aided Reliability Estimation, version three) is a computer program designed to help estimate the reliability of complex, redundant systems. Although the program can model a wide variety of redundant structures, it was developed specifically for fault-tolerant avionics systems--systems distinguished by the need for extremely reliable performance since a system failure could well result in the loss of human life. It substantially generalizes the class of redundant configurations that can be accommodated, and includes a coverage model to determine the various coverage probabilities as a function of the applicable fault recovery mechanisms (detection delay, diagnostic scheduling interval, isolation and recovery delay, etc.). CARE 3 further generalizes the class of system structures that can be modeled and greatly expands the coverage model to take into account such effects as intermittent and transient faults, latent faults, error propagation, etc.

    Reliability and Condition-Based Maintenance Analysis of Deteriorating Systems Subject to Generalized Mixed Shock Model

    For successful commercialization of evolving devices (e.g., micro-electro-mechanical systems, and biomedical devices), there must be new research focusing on reliability models and analysis tools that can assist manufacturing and maintenance of these devices. These advanced systems may experience multiple failure processes that compete against each other. Two major failure processes are identified to be deteriorating or degradation processes (e.g., wear, fatigue, erosion, corrosion) and random shocks. When these failure processes are dependent, it is a challenging problem to predict reliability of complex systems. This research aims to develop reliability models by exploring new aspects of dependency between competing risks of degradation-based and shock-based failure considering a generalized mixed shock model, and to develop new and effective condition-based maintenance policies based on the developed reliability models. In this research, different aspects of dependency are explored to accurately estimate the reliability of complex systems. When the degradation rate is accelerated as a result of withstanding a particular shock pattern, we develop reliability models with a changing degradation rate for four different shock patterns. When the hard failure threshold reduces due to changes in degradation, we investigate reliability models considering the dependence of the hard failure threshold on the degradation level for two different scenarios. More generally, when the degradation rate and the hard failure threshold can simultaneously transition multiple times, we propose a rich reliability model for a new generalized mixed shock model that is a combination of extreme shock model, δ-shock model and run shock model. This general assumption reflects complex behaviors associated with modern systems and structures that experience multiple sources of external shocks. 
    Based on the developed reliability models, we introduce new condition-based maintenance strategies by including various maintenance actions (e.g., corrective replacement, preventive replacement, and imperfect repair) to minimize the expected long-run average maintenance cost rate. The decisions for maintenance actions are made based on the health condition of systems that can be observed through periodic inspection. The reliability and maintenance models developed in this research can provide timely and effective tools for decision-makers in manufacturing to economically optimize operational decisions for improving reliability, quality and productivity.
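The three classical shock criteria that the generalized mixed shock model combines can be sketched as a simple check over a shock history. This assumes the usual textbook definitions (extreme: one shock above a limit; δ-shock: two consecutive shocks closer than δ in time; run: a given number of consecutive shocks above a limit). It omits the changing degradation rate and threshold transitions described in the abstract, and all parameter names are hypothetical.

```python
def first_hard_failure(shocks, extreme_limit, delta, run_limit, run_length):
    """Return (index, cause) of the first shock that triggers a hard failure
    under a mixed extreme / delta / run shock rule, or None if none does.

    shocks: list of (arrival_time, magnitude) pairs, sorted by arrival time.
    """
    run = 0
    prev_time = None
    for i, (time, magnitude) in enumerate(shocks):
        # Extreme shock: a single magnitude above the extreme limit.
        if magnitude > extreme_limit:
            return i, "extreme"
        # Delta shock: inter-arrival time shorter than delta.
        if prev_time is not None and time - prev_time < delta:
            return i, "delta"
        # Run shock: run_length consecutive magnitudes above run_limit.
        run = run + 1 if magnitude > run_limit else 0
        if run >= run_length:
            return i, "run"
        prev_time = time
    return None


params = dict(extreme_limit=9.0, delta=0.5, run_limit=5.0, run_length=3)
extreme_case = first_hard_failure([(1.0, 2.0), (5.0, 9.5)], **params)
delta_case = first_hard_failure([(1.0, 2.0), (1.3, 2.0)], **params)
run_case = first_hard_failure([(1.0, 6.0), (3.0, 6.0), (5.0, 6.0)], **params)
safe_case = first_hard_failure([(1.0, 2.0), (3.0, 6.0), (5.0, 2.0)], **params)
```

When several criteria are met by the same shock, this sketch reports them in the fixed order extreme, delta, run; a fuller model would also track soft failure from accumulated degradation.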