536,445 research outputs found

    Monitoring distributed object and component communication

    Get PDF
    This thesis presents our work in the area of monitoring distributed software applications (DSAs). We produce three main results: (1) a design approach for building monitoring systems, (2) a design of a system for MOnitoring Distributed Object and Component Communication (MODOCC) behavior in middleware-based applications, and (3) a proof-of-concept implementation of this system

    Monitoring Partially Synchronous Distributed Systems using SMT Solvers

    Full text link
    In this paper, we discuss the feasibility of monitoring partially synchronous distributed systems to detect latent bugs, i.e., errors caused by concurrency and race conditions among concurrent processes. We present a monitoring framework where we model both system constraints and latent bugs as Satisfiability Modulo Theories (SMT) formulas, and we detect the presence of latent bugs using an SMT solver. We demonstrate the feasibility of our framework using both synthetic applications where latent bugs occur at any time with random probability and an application involving exclusive access to a shared resource with a subtle timing bug. We illustrate how the time required for verification is affected by parameters such as communication frequency, latency, and clock skew. Our results show that our framework can be used for real-life applications, and because our framework uses SMT solvers, the range of appropriate applications will increase as these solvers become more efficient over time.Comment: Technical Report corresponding to the paper accepted at Runtime Verification (RV) 201

    Tools for monitoring and controlling distributed applications

    Get PDF
    The Meta system is a UNIX-based toolkit that assists in the construction of reliable reactive systems, such as distributed monitoring and debugging systems, tool integration systems and reliable distributed applications. Meta provides mechanisms for instrumenting a distributed application and the environment in which it executes, and Meta supplies a service that can be used to monitor and control such an instrumented application. The Meta toolkit is built on top of the ISIS toolkit; they can be used together in order to build fault-tolerant and adaptive, distributed applications

    A distributed networked approach for fault detection of large-scale systems

    Get PDF
    Networked systems present some key new challenges in the development of fault diagnosis architectures. This paper proposes a novel distributed networked fault detection methodology for large-scale interconnected systems. The proposed formulation incorporates a synchronization methodology with a filtering approach in order to reduce the effect of measurement noise and time delays on the fault detection performance. The proposed approach allows the monitoring of multi-rate systems, where asynchronous and delayed measurements are available. This is achieved through the development of a virtual sensor scheme with a model-based re-synchronization algorithm and a delay compensation strategy for distributed fault diagnostic units. The monitoring architecture exploits an adaptive approximator with learning capabilities for handling uncertainties in the interconnection dynamics. A consensus-based estimator with timevarying weights is introduced, for improving fault detectability in the case of variables shared among more than one subsystem. Furthermore, time-varying threshold functions are designed to prevent false-positive alarms. Analytical fault detectability sufficient conditions are derived and extensive simulation results are presented to illustrate the effectiveness of the distributed fault detection technique
    • …
    corecore