55,501 research outputs found

    A distributed networked approach for fault detection of large-scale systems

    Get PDF
    Networked systems present some key new challenges in the development of fault diagnosis architectures. This paper proposes a novel distributed networked fault detection methodology for large-scale interconnected systems. The proposed formulation incorporates a synchronization methodology with a filtering approach in order to reduce the effect of measurement noise and time delays on the fault detection performance. The proposed approach allows the monitoring of multi-rate systems, where asynchronous and delayed measurements are available. This is achieved through the development of a virtual sensor scheme with a model-based re-synchronization algorithm and a delay compensation strategy for distributed fault diagnostic units. The monitoring architecture exploits an adaptive approximator with learning capabilities for handling uncertainties in the interconnection dynamics. A consensus-based estimator with timevarying weights is introduced, for improving fault detectability in the case of variables shared among more than one subsystem. Furthermore, time-varying threshold functions are designed to prevent false-positive alarms. Analytical fault detectability sufficient conditions are derived and extensive simulation results are presented to illustrate the effectiveness of the distributed fault detection technique

    A Distributed Networked Approach for Fault Detection of Large-scale Systems

    Get PDF
    Networked systems present some key new challenges in the development of fault diagnosis architectures. This paper proposes a novel distributed networked fault detection methodology for large-scale interconnected systems. The proposed formulation incorporates a synchronization methodology with a filtering approach in order to reduce the effect of measurement noise and time delays on the fault detection performance. The proposed approach allows the monitoring of multi-rate systems, where asynchronous and delayed measurements are available. This is achieved through the development of a virtual sensor scheme with a model-based re-synchronization algorithm and a delay compensation strategy for distributed fault diagnostic units. The monitoring architecture exploits an adaptive approximator with learning capabilities for handling uncertainties in the interconnection dynamics. A consensus-based estimator with timevarying weights is introduced, for improving fault detectability in the case of variables shared among more than one subsystem. Furthermore, time-varying threshold functions are designed to prevent false-positive alarms. Analytical fault detectability sufficient conditions are derived and extensive simulation results are presented to illustrate the effectiveness of the distributed fault detection technique

    Theories for Session-based Governance for Large-scale Distributed Systems

    Get PDF
    PhDLarge-scale distributed systems and distributed computing are the pillars of IT infrastructure and society nowadays. Robust theoretical principles for designing, building, managing and understanding the interactive behaviours of such systems need to be explored. A promising approach for establishing such principles is to view the session as the key unit for design, execution and verification. Governance is a general term for verifying whether activities meet the specified requirements and for enforcing safe behaviours among processes. This thesis, based on the asynchronous -calculus and the theory of session types, provides a monitoring framework and a theory for validating specifications, verifying mutual behaviours during runtime, and taking actions when noncompliant behaviours are detected. We explore properties and principles for governing large-scale distributed systems, in which autonomous and heterogeneous system components interact with each other in the network to accomplish application goals. This thesis, incorporating lessons from my participation in a substantial practical project, the Ocean Observatories Initiative (OOI), proposes an asynchronous monitoring framework and the process calculus for dynamically governing the asynchronous interactions among distributed multiple applications. We prove that this monitoring model guarantees the satisfaction of global assertions, and state and prove theorems of local and global safety, transparency, and session fidelity. We also study and introduce the semantic mechanisms for runtime session-based governance and the principles of validation of stateful specifications through capturing the runtime asynchronous interactions.EPSRC grants EP/G015481/1; Queen Mary University of Londo

    On Synchronous and Asynchronous Monitor Instrumentation for Actor-based systems

    Full text link
    We study the impact of synchronous and asynchronous monitoring instrumentation on runtime overheads in the context of a runtime verification framework for actor-based systems. We show that, in such a context, asynchronous monitoring incurs substantially lower overhead costs. We also show how, for certain properties that require synchronous monitoring, a hybrid approach can be used that ensures timely violation detections for the important events while, at the same time, incurring lower overhead costs that are closer to those of an asynchronous instrumentation.Comment: In Proceedings FOCLASA 2014, arXiv:1502.0315

    Decentralized Runtime Verification of LTL Specifications in Distributed Systems

    Get PDF
    Runtime verification is a lightweight automated formal method for specification-based run- time monitoring as well as testing of large real-world systems. While numerous techniques exist for runtime verification of sequential programs, there has been very little work on specification- based monitoring of distributed systems. In this work, we propose the first sound and complete method for runtime verification of asynchronous distributed programs for the 3-valued semantics of LTL specifications defined over the global state of the program. Our technique for evaluating LTL properties is inspired by distributed computation slicing, an approach for abstracting distributed computations with respect to a given predicate. Our monitoring technique is fully decentralized in that each process in the distributed program under inspection maintains a replica of the monitor automaton. Each monitor may maintain a set of possible verification verdicts based upon existence of concurrent events. Our experiments on runtime monitoring of a set of iOS devices running a distributed program show that due to the design of our Algorithm, monitoring overhead grows only in the linear order of the number of processes and events that need to be monitored

    Monitoring Partially Synchronous Distributed Systems using SMT Solvers

    Full text link
    In this paper, we discuss the feasibility of monitoring partially synchronous distributed systems to detect latent bugs, i.e., errors caused by concurrency and race conditions among concurrent processes. We present a monitoring framework where we model both system constraints and latent bugs as Satisfiability Modulo Theories (SMT) formulas, and we detect the presence of latent bugs using an SMT solver. We demonstrate the feasibility of our framework using both synthetic applications where latent bugs occur at any time with random probability and an application involving exclusive access to a shared resource with a subtle timing bug. We illustrate how the time required for verification is affected by parameters such as communication frequency, latency, and clock skew. Our results show that our framework can be used for real-life applications, and because our framework uses SMT solvers, the range of appropriate applications will increase as these solvers become more efficient over time.Comment: Technical Report corresponding to the paper accepted at Runtime Verification (RV) 201
    corecore