Fault localization in general refers to a technique for identifying
the likely root causes of failures observed in systems formed from
components. Fault localization in systems deployed on mobile ad hoc
networks (MANETs) is a particularly challenging task because those
systems are subject to a wider variety and higher incidence of faults
than those deployed in fixed networks, the resources available to
track fault symptoms are severely limited, and many of the sources of
faults in MANETs are by their nature transient.
We present a suite of three methods, each responsible for part of the
overall task of localizing the faults occurring in service-based
systems hosted on MANETs. First, we describe a dependence discovery
method, designed specifically for this environment, yielding dynamic
snapshots of dependence relationships discovered through decentralized
observations of service interactions. Next, we present a fault
localization method that employs both Bayesian and timing-based
reasoning to analyze the dependence data produced by the dependence
discovery method in the context of a specific fault propagation model,
deriving a ranked list of candidate fault locations. Third, we present
an epidemic protocol designed for transferring the dependence and
symptom data between nodes of MANETs with low connectivity. The
protocol creates a network-wide synchronization overlay and transfers
the data over intermediate nodes in periodic synchronization cycles.
We introduce a new tool for simulating service-based systems hosted
on MANETs and use it to evaluate several operational aspects of the
methods. Next, we present an implementation of the methods in Java EE
and use an emulation environment to evaluate them. We
present the results of an extensive set of experiments exploring a
wide range of operational conditions to evaluate the accuracy and
performance of our methods.