6 research outputs found

    Beyond the golden run : evaluating the use of reference run models in fault injection analysis

    Get PDF
    Fault injection (FI) has been shown to be an effective approach to assess- ing the dependability of software systems. To determine the impact of faults injected during FI, a given oracle is needed. This oracle can take a variety of forms, however prominent oracles include (i) specifications, (ii) error detection mechanisms and (iii) golden runs. Focusing on golden runs, in this paper we show that there are classes of software which a golden run based approach can not be used to analyse. Specifically we demonstrate that a golden run based approach can not be used when analysing systems which employ a main control loop with an irregular period. Further, we show how a simple model, which has been refined using FI, can be employed as an oracle in the analysis of such a system

    Enhancing Failure Propagation Analysis in Cloud Computing Systems

    Full text link
    In order to plan for failure recovery, the designers of cloud systems need to understand how their system can potentially fail. Unfortunately, analyzing the failure behavior of such systems can be very difficult and time-consuming, due to the large volume of events, non-determinism, and reuse of third-party components. To address these issues, we propose a novel approach that joins fault injection with anomaly detection to identify the symptoms of failures. We evaluated the proposed approach in the context of the OpenStack cloud computing platform. We show that our model can significantly improve the accuracy of failure analysis in terms of false positives and negatives, with a low computational cost.Comment: 12 pages, The 30th International Symposium on Software Reliability Engineering (ISSRE 2019

    Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

    Full text link
    Cloud computing systems fail in complex and unexpected ways due to unexpected combinations of events and interactions between hardware and software components. Fault injection is an effective means to bring out these failures in a controlled environment. However, fault injection experiments produce massive amounts of data, and manually analyzing these data is inefficient and error-prone, as the analyst can miss severe failure modes that are yet unknown. This paper introduces a new paradigm (fault injection analytics) that applies unsupervised machine learning on execution traces of the injected system, to ease the discovery and interpretation of failure modes. We evaluated the proposed approach in the context of fault injection experiments on the OpenStack cloud computing platform, where we show that the approach can accurately identify failure modes with a low computational cost.Comment: IEEE Transactions on Dependable and Secure Computing; 16 pages. arXiv admin note: text overlap with arXiv:1908.1164

    Towards the design of efficient error detection mechanisms

    Get PDF
    The pervasive nature of modern computer systems has led to an increase in our reliance on such systems to provide correct and timely services. Moreover, as the functionality of computer systems is being increasingly defined in software, it is imperative that software be dependable. It has previously been shown that a fault intolerant software system can be made fault tolerant through the design and deployment of software mechanisms implementing abstract artefacts known as error detection mechanisms (EDMs) and error recovery mechanisms (ERMs), hence the design of these components is central to the design of dependable software systems. The EDM design problem, which relates to the construction of a boolean predicate over a set of program variables, is inherently difficult, with current approaches relying on system specifications and the experience of software engineers. As this process necessarily entails the identification and incorporation of program variables by an error detection predicate, this thesis seeks to address the EDM design problem from a novel variable-centric perspective, with the research presented supporting the thesis that, where it exists under the assumed system model, an efficient EDM consists of a set of critical variables. In particular, this research proposes (i) a metric suite that can be used to generate a relative ranking of the program variables in a software with respect to their criticality, (ii) a systematic approach for the generation of highly-efficient error detection predicates for EDMs, and (iii) an approach for dependability enhancement based on the protection of critical variables using software wrappers that implement error detection and correction predicates that are known to be efficient. This research substantiates the thesis that an efficient EDM contains a set of critical variables on the basis that (i) the proposed metric suite is able, through application of an appropriate threshold, to identify critical variables, (ii) efficient EDMs can be constructed based only on the critical variables identified by the metric suite, and (iii) the criticality of the identified variables can be shown to extend across a software module such that an efficient EDM designed for that software module should seek to determine the correctness of the identified variables

    From Safety Analysis to Experimental Validation by Fault Injection—Case of Automotive Embedded Systems

    Get PDF
    En raison de la complexité croissante des systèmes automobiles embarqués, la sûreté de fonctionnement est devenue un enjeu majeur de l’industrie automobile. Cet intérêt croissant s’est traduit par la sortie en 2011 de la norme ISO 26262 sur la sécurité fonctionnelle. Les défis auxquelles sont confrontés les acteurs du domaine sont donc les suivants : d’une part, la conception de systèmes sûrs, et d’autre part, la conformité aux exigences de la norme ISO 26262. Notre approche se base sur l’application systématique de l’injection de fautes pour la vérification et la validation des exigences de sécurité, tout au long du cycle de développement, des phases de conception jusqu’à l’implémentation. L’injection de fautes nous permet en particulier de vérifier que les mécanismes de tolérance aux fautes sont efficaces et que les exigences non-fonctionnelles sont respectées. L’injection de faute est une technique de vérification très ancienne. Cependant, son rôle lors de la phase de conception et ses complémentarités avec la validation expérimentale, méritent d’être étudiés. Notre approche s’appuie sur l’application du modèle FARM (Fautes, Activations, Relevés et Mesures) tout au long du processus de développement. Les analyses de sûreté sont le point de départ de notre approche, avec l'identification des mécanismes de tolérance aux fautes et des exigences non-fonctionnelles, et se terminent par la validation de ces mécanismes par les expériences classiques d'injection de fautes. Enfin, nous montrons que notre approche peut être intégrée dans le processus de développement des systèmes embarqués automobiles décrits dans la norme ISO 26262. Les contributions de la thèse sont illustrées sur l’étude de cas d’un système d’éclairage avant d’une automobile. ABSTRACT : Due to the rising complexity of automotive Electric/Electronic embedded systems, Functional Safety becomes a main issue in the automotive industry. This issue has been formalized by the introduction of the ISO 26262 standard for functional safety in 2011. The challenges are, on the one hand to design safe systems based on a systematic verification and validation approach, and on the other hand, the fulfilment of the requirements of the ISO 26262 standard. Following ISO 26262 recommendations, our approach, based on fault injection, aims at verifying fault tolerance mechanisms and non-functional requirements at all steps of the development cycle, from early design phases down to implementation. Fault injection is a verification technique that has been investigated for a long time. However, the role of fault injection during design phase and its complementarities with the experimental validation of the target have not been explored. In this work, we investigate a fault injection continuum, from system design validation to experiments on implemented targets. The proposed approach considers the safety analyses as a starting point, with the identification of safety mechanisms and safety requirements, and goes down to the validation of the implementation of safety mechanisms through fault injection experiments. The whole approach is based on a key fault injection framework, called FARM (Fault, Activation, Readouts and Measures). We show that this approach can be integrated in the development process of the automotive embedded systems described in the ISO 26262 standard. Our approach is illustrated on an automotive case study: a Front-Light system

    Evaluating the use of reference run models in fault injection analysis

    Get PDF
    Fault injection (FI) has been shown to be an effective approach to assessing the dependability of software systems. To determine the impact of faults injected during FI, a given oracle is needed. Oracles can take a variety of forms, including (i) specifications, (ii) error detection mechanisms and (iii) golden runs. Focusing on golden runs, in this paper we show that there are classes of software which a golden run based approach can not be used to analyse. Specifically, we demonstrate that a golden run based approach can not be used in the analysis of systems which employ a main control loop with an irregular period. Further, we show how a simple model, which has been refined using FI experiments, can be employed as an oracle in the analysis of such a system
    corecore