116,644 research outputs found

    A compositional method for reliability analysis of workflows affected by multiple failure modes

    Get PDF
    We focus on reliability analysis for systems designed as workflow based compositions of components. Components are characterized by their failure profiles, which take into account possible multiple failure modes. A compositional calculus is provided to evaluate the failure profile of a composite system, given failure profiles of the components. The calculus is described as a syntax-driven procedure that synthesizes a workflows failure profile. The method is viewed as a design-time aid that can help software engineers reason about systems reliability in the early stage of development. A simple case study is presented to illustrate the proposed approach

    Cross-layer system reliability assessment framework for hardware faults

    Get PDF
    System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored in such an estimation model, the delivered reliability reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools for accurate but fast estimations of computing systems reliability. The backbone of the methodology is a component-based Bayesian model, which effectively calculates system reliability based on the masking probabilities of individual hardware and software components considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault injection campaigns at the microarchitecture level.Peer ReviewedPostprint (author's final draft

    The impact of diversity upon common mode failures

    Get PDF
    Recent models for the failure behaviour of systems involving redundancy and diversity have shown that common mode failures can be accounted for in terms of the variability of the failure probability of components over operational environments. Whenever such variability is present, we can expect that the overall system reliability will be less than we could have expected if the components could have been assumed to fail independently. We generalise a model of hardware redundancy due to Hughes [Hughes 1987], and show that with forced diversity, this unwelcome result no longer applies: in fact it becomes theoretically possible to do better than would be the case under independence of failures

    Design diversity: an update from research on reliability modelling

    Get PDF
    Diversity between redundant subsystems is, in various forms, a common design approach for improving system dependability. Its value in the case of software-based systems is still controversial. This paper gives an overview of reliability modelling work we carried out in recent projects on design diversity, presented in the context of previous knowledge and practice. These results provide additional insight for decisions in applying diversity and in assessing diverseredundant systems. A general observation is that, just as diversity is a very general design approach, the models of diversity can help conceptual understanding of a range of different situations. We summarise results in the general modelling of common-mode failure, in inference from observed failure data, and in decision-making for diversity in development.

    Engineering failure analysis and design optimisation with HiP-HOPS

    Get PDF
    The scale and complexity of computer-based safety critical systems, like those used in the transport and manufacturing industries, pose significant challenges for failure analysis. Over the last decade, research has focused on automating this task. In one approach, predictive models of system failure are constructed from the topology of the system and local component failure models using a process of composition. An alternative approach employs model-checking of state automata to study the effects of failure and verify system safety properties. In this paper, we discuss these two approaches to failure analysis. We then focus on Hierarchically Performed Hazard Origin & Propagation Studies (HiP-HOPS) - one of the more advanced compositional approaches - and discuss its capabilities for automatic synthesis of fault trees, combinatorial Failure Modes and Effects Analyses, and reliability versus cost optimisation of systems via application of automatic model transformations. We summarise these contributions and demonstrate the application of HiP-HOPS on a simplified fuel oil system for a ship engine. In light of this example, we discuss strengths and limitations of the method in relation to other state-of-the-art techniques. In particular, because HiP-HOPS is deductive in nature, relating system failures back to their causes, it is less prone to combinatorial explosion and can more readily be iterated. For this reason, it enables exhaustive assessment of combinations of failures and design optimisation using computationally expensive meta-heuristics. (C) 2010 Elsevier Ltd. All rights reserved

    Assessing the reliability of adaptive power system protection schemes

    Get PDF
    Adaptive power system protection can be used to improve the performance of existing protection schemes under certain network conditions. However, their deployment in the field is impeded by their perceived inferior reliability compared to existing protection arrangements. Moreover, their validation can be problematic due to the perceived high likelihood of the occurrence of failure modes or incorrect setting selection with variable network conditions. Reliability (including risk assessment) is one of the decisive measures that can be used in the process of verifying adaptive protection scheme performance. This paper proposes a generic methodology for assessing the reliability of adaptive protection. The method involves the identification of initiating events and scenarios that lead to protection failures and quantification of the probability of the occurrence of each failure. A numerical example of the methodology for an adaptive distance protection scheme is provided

    A synthesis of logic and bio-inspired techniques in the design of dependable systems

    Get PDF
    Much of the development of model-based design and dependability analysis in the design of dependable systems, including software intensive systems, can be attributed to the application of advances in formal logic and its application to fault forecasting and verification of systems. In parallel, work on bio-inspired technologies has shown potential for the evolutionary design of engineering systems via automated exploration of potentially large design spaces. We have not yet seen the emergence of a design paradigm that effectively combines these two techniques, schematically founded on the two pillars of formal logic and biology, from the early stages of, and throughout, the design lifecycle. Such a design paradigm would apply these techniques synergistically and systematically to enable optimal refinement of new designs which can be driven effectively by dependability requirements. The paper sketches such a model-centric paradigm for the design of dependable systems, presented in the scope of the HiP-HOPS tool and technique, that brings these technologies together to realise their combined potential benefits. The paper begins by identifying current challenges in model-based safety assessment and then overviews the use of meta-heuristics at various stages of the design lifecycle covering topics that span from allocation of dependability requirements, through dependability analysis, to multi-objective optimisation of system architectures and maintenance schedules

    Introducing the STAMP method in road tunnel safety assessment

    Get PDF
    After the tremendous accidents in European road tunnels over the past decade, many risk assessment methods have been proposed worldwide, most of them based on Quantitative Risk Assessment (QRA). Although QRAs are helpful to address physical aspects and facilities of tunnels, current approaches in the road tunnel field have limitations to model organizational aspects, software behavior and the adaptation of the tunnel system over time. This paper reviews the aforementioned limitations and highlights the need to enhance the safety assessment process of these critical infrastructures with a complementary approach that links the organizational factors to the operational and technical issues, analyze software behavior and models the dynamics of the tunnel system. To achieve this objective, this paper examines the scope for introducing a safety assessment method which is based on the systems thinking paradigm and draws upon the STAMP model. The method proposed is demonstrated through a case study of a tunnel ventilation system and the results show that it has the potential to identify scenarios that encompass both the technical system and the organizational structure. However, since the method does not provide quantitative estimations of risk, it is recommended to be used as a complementary approach to the traditional risk assessments rather than as an alternative. (C) 2012 Elsevier Ltd. All rights reserved
    • …
    corecore