thesis

Fault localization in service-based systems hosted in mobile ad hoc networks

Abstract

Fault localization refers, in general, to techniques for identifying the likely root causes of failures observed in systems formed from components. Fault localization in systems deployed on mobile ad hoc networks (MANETs) is particularly challenging because those systems are subject to a wider variety and higher incidence of faults than those deployed in fixed networks, the resources available to track fault symptoms are severely limited, and many of the sources of faults in MANETs are by their nature transient. We present a suite of three methods, each responsible for part of the overall task of localizing the faults occurring in service-based systems hosted on MANETs. First, we describe a dependence discovery method, designed specifically for this environment, that yields dynamic snapshots of dependence relationships discovered through decentralized observations of service interactions. Next, we present a method for localizing the faults occurring in service-based systems hosted on MANETs. We employ both Bayesian and timing-based reasoning techniques to analyze the dependence data produced by the dependence discovery method in the context of a specific fault propagation model, deriving a ranked list of candidate fault locations. Third, we present an epidemic protocol designed for transferring the dependence and symptom data between nodes of MANETs with low connectivity. The protocol creates a network-wide synchronization overlay and transfers the data over intermediate nodes in periodic synchronization cycles. We introduce a new tool for simulating service-based systems hosted on MANETs and use it to evaluate several operational aspects of the methods. We then present an implementation of the methods in Java EE and use an emulation environment to evaluate the methods.
We present the results of an extensive set of experiments exploring a wide range of operational conditions to evaluate the accuracy and performance of our methods.
