nodintsova@hotmail.com

In this work, we focus on cost-efficient techniques for real-time diagnosis in distributed systems that allow adaptive, on-line selection and execution of appropriate measurements (tests). In particular, one of our applications is fault diagnosis in distributed computer systems and networks using test transactions, or probes (e.g., "traceroute" or "ping" commands). The key efficiency issues are the cost of probing (e.g., the number of probes) and the computational complexity of diagnosis. In our past work (see (Rish, Brodie, & Ma 2002a)), we derived theoretical conditions on the number of probes required for asymptotically error-free diagnosis, and developed efficient search techniques for probe set selection that can greatly reduce the probe set size while maintaining its diagnostic capability (Brodie, Rish, & Ma 2001). Next, we formulated real-time diagnosis as probabilistic inference in Bayesian networks and investigated simple and efficient local approximation techniques based on variable elimination (the mini-bucket scheme (Dechter & Rish 2002)). Our empirical studies show that these approximations "degrade gracefully" with noise and often yield an optimal solution when noise is low enough, and our initial theoretical analysis explains this behavior for the simplest (greedy) approximation (Rish, Brodie, & Ma 2002a; 2002b). Our future work will focus on adapting more sophisticated approximation techniques, such as Generalized Belief Propagation (Yedidia, Freeman, & Weiss 2001), to real-time scenarios, and on real-time, incremental learning of Dynamic Bayesian Networks based on historical data and feedback on the diagnosis results.
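To make the idea of probe set selection concrete, the following is a minimal sketch of one possible greedy heuristic; it is not necessarily the search technique of the cited work. It assumes a noise-free setting with at most one faulty node, and it represents each probe as the set of nodes it traverses. The names `NO_FAULT`, `signature_partition`, and `greedy_probe_selection` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Set

NO_FAULT = "no_fault"  # hypothetical label for the healthy state (assumed not to be a node name)


def signature_partition(probes: Dict[str, Set[str]],
                        selected: List[str],
                        states: List[str]) -> List[Set[str]]:
    """Group states by the outcome vector they produce on the selected probes.

    A probe "fails" (True) for a state if the faulty node lies on the probe's path.
    States in the same group cannot be told apart by the selected probes.
    """
    groups: Dict[tuple, Set[str]] = {}
    for state in states:
        signature = tuple(state in probes[p] for p in selected)
        groups.setdefault(signature, set()).add(state)
    return list(groups.values())


def greedy_probe_selection(probes: Dict[str, Set[str]],
                           nodes: List[str]) -> List[str]:
    """Greedily add the probe that splits the states into the most groups.

    Stops when every state has a unique signature, or when no remaining
    probe improves the partition (the full probe set cannot do better).
    """
    states = [NO_FAULT] + list(nodes)
    selected: List[str] = []
    best_count = len(signature_partition(probes, selected, states))
    while best_count < len(states):
        best_probe = None
        for probe in sorted(probes):  # sorted for deterministic tie-breaking
            if probe in selected:
                continue
            count = len(signature_partition(probes, selected + [probe], states))
            if count > best_count:
                best_count, best_probe = count, probe
        if best_probe is None:
            break  # no remaining probe distinguishes any further states
        selected.append(best_probe)
    return selected


if __name__ == "__main__":
    # Toy network: three nodes, four candidate probes (illustrative data only).
    candidate_probes = {
        "p1": {"A", "B"},
        "p2": {"B", "C"},
        "p3": {"A"},
        "p4": {"A", "B", "C"},
    }
    print(greedy_probe_selection(candidate_probes, ["A", "B", "C"]))
```

In this toy example two of the four candidate probes already give every single-fault state a distinct outcome vector, which is the sense in which a reduced probe set can keep its diagnostic capability. Noisy probe outcomes, which the abstract handles through probabilistic inference, are outside the scope of this sketch.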