351 research outputs found

    One Net Fits All: A unifying semantics of Dynamic Fault Trees using GSPNs

    Get PDF
    Dynamic Fault Trees (DFTs) are a prominent model in reliability engineering. They are strictly more expressive than static fault trees, but this comes at a price: their interpretation is non-trivial and leaves quite some freedom. This paper presents a GSPN semantics for DFTs. This semantics is rather simple and compositional. The key feature is that this GSPN semantics unifies all existing DFT semantics from the literature. All semantic variants can be obtained by choosing appropriate priorities and treatment of non-determinism.Comment: Accepted at Petri Nets 201

    Dependability analysis of a safety critical system: the LHC Beam Dumping System at CERN

    Get PDF
    Il sistema di estrazione del fascio del nuovo acceleratore LHC del CERN (LHC Beam Dumping System, LBDS) ha il compito di rimuovere il fascio di particelle dall’anello in caso di anomalie, guasti nella macchina o al termine di una operazione. Il sistema rappresenta uno dei componenti critici per la sicurezza dell’acceleratore LHC. Il suo malfunzionamento puo’ portare alla mancata o parziale estrazione del fascio che, per le elevatissime energie raggiunte (7 TeV), ha la capacita’ di distruggere i magneti superconduttori dell’acceleratore e determinare l’arresto delle operazioni per un lungo periodo. La tesi affronta lo studio della sicurezza del sistema di estrazione del fascio di particelle ed il suo impatto sulla vita operativa del sistema in termini di numero aborto missioni(failsafe modes). Un modello dinamico ad eventi discreti stocastico del processo di guasto del sistema e’ stato ricavato partendo da una accurata analisi della sua architettura, dei modi e delle statistiche di guasto di ciascun componente. Il modello e’ stato analizzato rispetto a diversi scenari operativi, fornendo le stime della sicurezza e del numero aborto missioni per un anno di operazioni. L’analisi ha anche valutato l’efficacia delle soluzioni architetturali che sono state adottate per tollerare e prevenire il guasto nei componenti piu’ critici. I risultati ottenuti hanno dimostrato che il sistema rispetta i requisiti SIL3 dello standard di sicurezza IEC 61508, e non interferisce oltre misura sul normale funzionamento della macchina. Lo studio include anche una valutazione della sicurezza complessiva ottenuta per mezzo del sistema di protezione di cui il sistema LBDS e’ parte integrante

    Safe Architectural Design Principles

    Get PDF
    This report discusses architectures for safety-critical systems. The report summarises the existing literature in the area as well as the guidance provided by existing safety-critical system development standards. We discuss the three constituent functions of fault tolerant architectures: error detection, damage assessment and confinement and error recovery. We also consider methods for fault prevention

    Reliability modelling of dispensing processes in community pharmacy

    Get PDF
    Studies of error rates in community pharmacies have reported error rates of between 0.014% and 3.3% per item dispensed. This suggests up to 36 million items per year may contain errors in England. In addition, literature shows that patient satisfaction with services is directly related to waiting times. There is a need for a method to model pharmacy efficiency balancing safety and waiting times, ensuring that the reliability of the dispensing process is not compromised. In this paper a Coloured Petri Net (CPN) approach is proposed for analysing reliability and efficiency of community pharmacy. A pharmacy team work to complete dispensing and non-dispensing tasks, where non-dispensing tasks require staff to be temporarily removed from the dispensing process. The proposed approach is useful to investigate what affects the error rates and long waiting times, and provides modelling-based evidence to decision makers, looking to optimise staffing levels and improve the reliability of dispensing

    Reliability and Safety Modeling of a Digital Feed Water Control System

    Get PDF
    Much digital instrumentation and control systems embedded in the critical medical healthcare equipment aerospace devices and nuclear industry have obvious consequence of different failure modes. These failures can affect the behavior of the overall safety critical digital system and its ability to deliver its dependability attributes if any defected area that could be a hardware component or software code embedded inside the digital system is not detected and repaired appropriately. The safety and reliability analysis of safety critical systems can be accomplished with Markov modeling techniques which could express the dynamic and regenerative behavior of the digital control system. Certain states in the system represent system failure while others represent fault free behavior or correct operation in the presence of faults. This paper presents the development of a safety and reliability modeling of a digital feedwater control system using Markov based chain models. All the Markov states and the transitions between these states were assumed and calculated from the control logic for the digital control system. Finally based on the simulation results of modeling the digital feedwater control system the system does meet its reliability requirement with the probability of being in fully operational states is 0.99 over a 6 months time.Comment: 13 pages, 7 figures, conferenc

    Intelligent Hardware-Enabled Sensor and Software Safety and Health Management for Autonomous UAS

    Get PDF
    Unmanned Aerial Systems (UAS) can only be deployed if they can effectively complete their mission and respond to failures and uncertain environmental conditions while maintaining safety with respect to other aircraft as well as humans and property on the ground. We propose to design a real-time, onboard system health management (SHM) capability to continuously monitor essential system components such as sensors, software, and hardware systems for detection and diagnosis of failures and violations of safety or performance rules during the ight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and software signals; (2) signal analysis, preprocessing, and advanced on-the- y temporal and Bayesian probabilistic fault diagnosis; (3) an unobtrusive, lightweight, read-only, low-power hardware realization using Field Programmable Gate Arrays (FPGAs) in order to avoid overburdening limited computing resources or costly re-certi cation of ight software due to instrumentation. No currently available SHM capabilities (or combinations of currently existing SHM capabilities) come anywhere close to satisfying these three criteria yet NASA will require such intelligent, hardwareenabled sensor and software safety and health management for introducing autonomous UAS into the National Airspace System (NAS). We propose a novel approach of creating modular building blocks for combining responsive runtime monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. Our proposed research program includes both developing this novel approach and demonstrating its capabilities using the NASA Swift UAS as a demonstration platform

    The Design of Fail-Safe Logic

    Get PDF
    This paper examines the behavior of digital logic families, specifically identifying the properties and characteristics of digital fail-safe logic. Fail-safe digital design is examined utilizing classical logic and semiconductor theory. The effects of failures internal to the structure of digital integrated circuits are analyzed and a discussion of pertinent logic design is presented. The techniques to detect all types of multiple failure modes are examined. With these results, a method of design for fail-safe logic is presented and analyzed

    Emerging trends proceedings of the 17th International Conference on Theorem Proving in Higher Order Logics: TPHOLs 2004

    Get PDF
    technical reportThis volume constitutes the proceedings of the Emerging Trends track of the 17th International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2004) held September 14-17, 2004 in Park City, Utah, USA. The TPHOLs conference covers all aspects of theorem proving in higher order logics as well as related topics in theorem proving and verification. There were 42 papers submitted to TPHOLs 2004 in the full research cate- gory, each of which was refereed by at least 3 reviewers selected by the program committee. Of these submissions, 21 were accepted for presentation at the con- ference and publication in volume 3223 of Springer?s Lecture Notes in Computer Science series. In keeping with longstanding tradition, TPHOLs 2004 also offered a venue for the presentation of work in progress, where researchers invite discussion by means of a brief introductory talk and then discuss their work at a poster session. The work-in-progress papers are held in this volume, which is published as a 2004 technical report of the School of Computing at the University of Utah

    Extension of the behavior composition framework in presence of failures using recovery techniques and AKKA

    Get PDF
    Abstract: Fault tolerance is an essential property to be satis ed in the composition of services, but reaching a high level of fault tolerance remains a challenge. In the area of ubiquitous computing, the composition of services is inevitable when a request cannot be carried out by a single service, but by a combination of several services. This thesis studies fault tolerance in the context of a general behavior composition framework. This approach raises, rst, the problem of the synthesis of controllers (or compositions) in order to coordinate a set of available services to achieve a new service, the target service and, second, the exploitation of all compositions to make the new service fault tolerant. Although a solution has been proposed by the authors of the behavior composition framework, it is incomplete and has not been evaluated experimentally or in situ. This thesis brings two contributions to this problem. On one hand, it considers the case in which the service selected by the controller is temporarily or permanently unavailable by exploiting recovery techniques to identify a consistent state of the system from which it may progress using other services or leave it in a coherent state when none of the available services no longer allows progression. On the other hand, it evaluates several recovery solutions, each useful in services malfunction situations, using a case study implemented with the aid of Akka, a tool that facilitates the development of reactive, concurrent and distributed systems.La tolérance aux fautes est une propriété indispensable à satisfaire dans la composition de services, mais atteindre un haut de niveau de tolérance aux fautes représente un défi majeur. Dans l'ère de l'informatique ubiquitaire, la composition de services est inévitable lorsqu'une requête ne peut être réalisée par un seul service, mais par la combinaison de plusieurs services. Ce mémoire étudie la tolérance aux fautes dans le contexte d'un cadre général de composition de comportements (behavior composition framework en anglais). Cette approche soulève, tout d'abord, le problème de la synthèse de contrôleurs (ou compositions) de façon à coordonner un ensemble de services disponibles afin de réaliser un nouveau service, le service cible et, ensuite, celui de l'exploitation de l'ensemble des compositions afin de rendre le nouveau service tolérant aux fautes. Bien qu'une solution ait été proposée par les auteurs de ce cadre de composition, elle est incomplète et elle n'a pas été évaluée expérimentalement ou in situ. Ce mémoire apporte deux contributions à ce problème. D'une part, il considère le cas dans lequel le service visé par le contrôleur est temporairement ou définitivement non disponible en exploitant des techniques de reprise afin d'identifier un état cohérent du système à partir duquel il peut progresser en utilisant d'autres services ou de le laisser dans un état cohérent lorsqu'aucun service, parmi ceux disponibles, ne permet plus de progression. D'autre part, il évalue plusieurs solutions de reprise, chacune utile dans des situations particulières de pannes, à l'aide d'une étude de cas implémentée en Akka, un outil qui permet aisément de mettre en oeuvre des systèmes réactifs, concurrents et répartis
    • …
    corecore