Using Abduction in Markov Logic Networks for Root Cause Analysis
IT infrastructure is a crucial part of most of today's business operations.
High availability, reliability, and short response times to outages are
essential. Thus a high degree of tool support and automation in risk management
is desirable to reduce outages. We propose a new approach for calculating the
root cause for an observed failure in an IT infrastructure. Our approach is
based on Abduction in Markov Logic Networks. Abduction aims to find an
explanation for a given observation in the light of some background knowledge.
In failure diagnosis, the explanation corresponds to the root cause, the
observation to the failure of a component, and the background knowledge to the
dependency graph extended by potential risks. We apply a method to extend a
Markov Logic Network in order to conduct abductive reasoning, which is not
naturally supported in this formalism. Our approach exhibits a high degree of
reusability and enables users without specific knowledge of a concrete
infrastructure to gain viable insights in the case of an incident. We
implemented the method in a tool and illustrate its suitability for root cause
analysis by applying it to a sample scenario.
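
To make the mapping between abduction and failure diagnosis concrete, the
following is a minimal sketch and not the tool described above. It assumes a
hypothetical four-component dependency graph and hypothetical risk weights, and
it replaces MAP inference in a Markov Logic Network with brute-force
enumeration: every candidate set of root failures is checked against the
observations, and the consistent set with the highest summed weight is
returned. All names (DEPENDS_ON, RISK_WEIGHT, is_down, abduce) are
illustrative assumptions.

```python
from itertools import product

# Hypothetical background knowledge: component -> components it depends on.
DEPENDS_ON = {
    "webshop": ["app_server"],
    "app_server": ["database", "network"],
    "database": ["network"],
    "network": [],
}

# Hypothetical risk weights (log-odds style): a higher value means the
# component is more likely to fail on its own.
RISK_WEIGHT = {
    "webshop": -3.0,
    "app_server": -2.0,
    "database": -1.0,
    "network": -2.5,
}


def is_down(component, root_failures):
    """A component is down if it failed itself or any dependency is down."""
    if component in root_failures:
        return True
    return any(is_down(dep, root_failures) for dep in DEPENDS_ON[component])


def score(root_failures):
    """Unnormalised log-score of an explanation: sum of its risk weights."""
    return sum(RISK_WEIGHT[c] for c in root_failures)


def abduce(observations):
    """Return the best-scoring set of root failures consistent with the
    observations, given as a dict {component: True if down, False if up}.
    Enumeration is exponential in the number of components, so this is
    only suitable for toy graphs."""
    components = sorted(DEPENDS_ON)
    best, best_score = None, float("-inf")
    for bits in product([False, True], repeat=len(components)):
        roots = frozenset(c for c, b in zip(components, bits) if b)
        consistent = all(
            is_down(c, roots) == down for c, down in observations.items()
        )
        if consistent and score(roots) > best_score:
            best, best_score = roots, score(roots)
    return best, best_score
```

Calling abduce({"webshop": True}) yields the database as the most plausible
root cause under these assumed weights; adding the observation that the
database is up, abduce({"webshop": True, "database": False}), shifts the
explanation to the application server. A real Markov Logic Network would
encode the dependencies and risks as weighted first-order formulas and use a
dedicated MAP solver instead of enumeration.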