Theoretical and Practical Aspects of Decision Support Systems Based on the Principles of Query-Based Diagnostics
Diagnosis has traditionally been one of the most successful applications of Bayesian networks. The main bottleneck in applying Bayesian networks to diagnostic problems seems to be model building, which is typically a complex and time-consuming task.
Query-based diagnostics offers passive, incremental construction of diagnostic models that rest on the interaction between a diagnostician and a computer-based diagnostic system. Every case, passively observed by the system, adds information and, in the long run, leads to construction of a usable model. This approach minimizes knowledge engineering in model building.
This dissertation focuses on theoretical and practical aspects of building systems based on the idea of query-based diagnostics. Its main contributions are an investigation of the optimal approach to learning parameters of Bayesian networks from continuous data streams, dealing with structural complexity in building Bayesian networks through removal of the weakest arcs, and a practical evaluation of the idea of query-based diagnostics. One of the main problems of query-based diagnostic systems is dealing with complexity. As data comes
in, the models constructed may become too large and too densely connected. I address this problem in two ways. First, I present an empirical comparison of Bayesian network
parameter learning algorithms. This study provides the optimal solutions for the system when dealing with continuous data streams. Second, I conduct a series of experiments testing control of the growth of a model by means of removing its weakest arcs. The results show that removing up to 20 percent of the weakest arcs in a network has minimal effect on its classification accuracy, and reduces the amount of memory taken by the clique tree, and thereby the amount of computation needed to perform inference. An empirical evaluation of query-based diagnostic systems shows that the diagnostic accuracy reaches reasonable levels after merely tens of cases and continues to increase with the number of cases, comparing favorably to state-of-the-art approaches based on learning
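The arc-removal idea described above can be sketched in a few lines. In this sketch, arc strength is approximated by the empirical mutual information between an arc's endpoints, which is one plausible choice; the dissertation's actual strength measure and data structures may differ, and the function names are illustrative:

```python
import math

def mutual_information(samples, i, j):
    """Empirical mutual information between variables i and j,
    used here as a simple proxy for the strength of arc (i, j)."""
    n = len(samples)
    pij, pi, pj = {}, {}, {}
    for s in samples:
        pij[(s[i], s[j])] = pij.get((s[i], s[j]), 0) + 1 / n
        pi[s[i]] = pi.get(s[i], 0) + 1 / n
        pj[s[j]] = pj.get(s[j], 0) + 1 / n
    return sum(p * math.log(p / (pi[a] * pj[b]))
               for (a, b), p in pij.items())

def prune_weakest_arcs(arcs, samples, fraction=0.2):
    """Drop the weakest `fraction` of arcs, ranked by empirical MI."""
    ranked = sorted(arcs, key=lambda e: mutual_information(samples, *e))
    k = int(len(ranked) * fraction)
    return ranked[k:]
```

With `fraction=0.2` this mirrors the 20 percent figure reported in the abstract: the surviving arc set is smaller, so the induced clique tree, and hence inference cost, shrinks.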
Efficient Variational Inference for Hierarchical Models of Images, Text, and Networks
Variational inference provides a general optimization framework to approximate the posterior distributions of latent variables in probabilistic models. Although effective in simple scenarios, variational inference may be inaccurate or infeasible when the data is high-dimensional, the model structure is complicated, or variable relationships are non-conjugate. We propose solutions to these problems through the smart design and leverage of model structures, the rigorous derivation of variational bounds, and the creation of flexible algorithms for various models with rich, non-conjugate dependencies.

Concretely, we first design an interpretable generative model for natural images, in which the hundreds of thousands of pixels per image are split into small patches represented by Gaussian mixture models. Through structured variational inference, the evidence lower bound of this model automatically recovers the popular expected patch log-likelihood method for image processing. A nonparametric extension using hierarchical Dirichlet processes further enables self-similarities to be captured and image-specific clusters created during inference, boosting image denoising and inpainting accuracy.

Then we move on to text data, and design hierarchical topic graphs that generalize the bipartite noisy-OR models previously used for medical diagnosis. We derive auxiliary bounds to overcome the non-conjugacy of noisy-OR conditionals, and use stochastic variational inference to efficiently train on datasets with hundreds of thousands of documents. We dramatically increase the algorithm's speed through a constrained family of variational bounds, so that only the ancestors of the sparse observed tokens of each document need to be considered.

Finally, we propose a general-purpose Monte Carlo variational inference strategy that is directly applicable to any model with discrete variables. Compared to REINFORCE-style stochastic gradient updates, our coordinate-ascent updates have lower variance and converge much faster. Compared to auxiliary-variable bounds crafted for each individual model, our algorithm is simpler to derive and may be easily integrated into probabilistic programming languages for broader use. By avoiding auxiliary variables, we also tighten likelihood bounds and increase robustness to local optima. Extensive experiments on real-world models of images, text, and networks illustrate these appealing advantages.
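The coordinate-ascent updates contrasted with REINFORCE above can be illustrated on a toy problem. The sketch below runs generic mean-field CAVI on a two-variable discrete joint given as a log-probability table; it shows the update pattern only and is not the authors' algorithm:

```python
import math

def cavi_two_discrete(logp, iters=50):
    """Mean-field CAVI for a joint p(z1, z2) given as a log-probability
    table logp[z1][z2]. The factorization q(z1, z2) = q1(z1) q2(z2) is
    optimized by alternately setting q1(z1) proportional to
    exp(E_{q2}[log p(z1, z2)]) and symmetrically for q2."""
    K1, K2 = len(logp), len(logp[0])
    q1 = [1.0 / K1] * K1
    q2 = [1.0 / K2] * K2
    for _ in range(iters):
        # update q1 holding q2 fixed
        s = [sum(q2[b] * logp[a][b] for b in range(K2)) for a in range(K1)]
        m = max(s)
        w = [math.exp(x - m) for x in s]
        z = sum(w)
        q1 = [x / z for x in w]
        # update q2 holding q1 fixed
        s = [sum(q1[a] * logp[a][b] for a in range(K1)) for b in range(K2)]
        m = max(s)
        w = [math.exp(x - m) for x in s]
        z = sum(w)
        q2 = [x / z for x in w]
    return q1, q2
```

Each update is a closed-form expectation rather than a sampled score-function gradient, which is the source of the lower variance the abstract refers to.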
Optimal inference with local expressions
Probabilistic inference using Bayesian networks is now a well-established approach for reasoning under uncertainty. Among the many efficiency-driven techniques which have been developed, the Optimal Factoring Problem (OFP) is distinguished for presenting a combinatorial optimization point of view on the problem. The contribution of this thesis is to extend OFP into a theoretical framework that covers not only standard Bayesian networks but also non-standard Bayesian networks. A non-standard Bayesian network has structures within its local distributions that are significant to the problem. This thesis presents value sets algebra as a coherent framework that facilitates formal treatment of inference in both standard and non-standard Bayesian networks as a combinatorial optimization problem.

Parallel to value sets algebra theory, local expression languages allow one to symbolically encode Bayesian network distributions. Such symbolic encodings allow all the structural and numerical information in distributions to be represented in the most compact form. However, the symbolic and syntactic flexibility of local expression languages has the usual drawback of allowing possibly incoherent expressions. Value sets algebra leads us to an efficient coherency verification on such expressions.

This thesis views optimal inference with local expressions as an optimal search problem. The search space for this problem is shown to be so large that it renders any exhaustive search impractical. Hence it is necessary to turn to heuristic solutions. Using the A* heuristic framework and ideas from OFP, which is the counterpart of this problem for standard Bayesian networks, a heuristic algorithm for the problem is developed. As a key feature, this algorithm differentiates between symbolic combinations of expressions and arithmetic operations in the expressions. Cost-bearing arithmetic operations are performed only when sufficient information is available to guarantee that no saving opportunities are lost. On the other hand, expressions are combined in a way that quickly provides maximum opportunity for efficient arithmetic operations. This thesis also explores the representation of Intercausal Independencies (ICI) in Bayesian networks and defines some new operators in the local expression language which are shown to facilitate more efficient ICI representations.
Exploiting Structure in Backtracking Algorithms for Propositional and Probabilistic Reasoning
Boolean propositional satisfiability (SAT) and probabilistic reasoning represent
two core problems in AI. Backtracking based algorithms have been applied in both
problems. In this thesis, I investigate structure-based techniques for solving real-world
SAT and Bayesian network instances, such as those arising from software testing and medical diagnosis.
When solving a SAT instance using backtracking search, a sequence of decisions
must be made as to which variable to branch on or instantiate next. Real-world problems
are often amenable to a divide-and-conquer strategy where the original instance
is decomposed into independent sub-problems. Existing decomposition techniques
are based on pre-processing the static structure of the original problem. I propose
a dynamic decomposition method based on hypergraph separators. Integrating this
dynamic separator decomposition into the variable ordering of a modern SAT solver
leads to speedups on large real-world SAT problems.
Encoding a Bayesian network into a CNF formula and then performing weighted
model counting is an effective method for exact probabilistic inference. I present two
encodings for improving this approach with noisy-OR and noisy-MAX relations. In
our experiments, our new encodings are more space-efficient and speed up the
previous best approaches by over two orders of magnitude.
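The noisy-OR relation the encodings exploit has a standard closed form: each active parent independently fails to trigger the child, and an optional leak term triggers it otherwise. A minimal sketch of that conditional probability (the function name is illustrative):

```python
def noisy_or(p_parents, leak, x):
    """P(child = 1 | parent states x) under the noisy-OR model.
    p_parents[i] is the probability that an active parent i alone
    triggers the child; `leak` covers causes outside the model."""
    q = 1.0 - leak  # probability the child stays off so far
    for p_i, x_i in zip(p_parents, x):
        if x_i:  # only active parents contribute
            q *= 1.0 - p_i
    return 1.0 - q
```

Because the relation factorizes over parents, it needs only one parameter per parent, which is what makes the compact CNF encodings described above possible in the first place.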
The ability to solve similar problems incrementally is critical for many probabilistic
reasoning problems. My aim is to exploit the similarity of these instances by
forwarding structural knowledge learned during the analysis of one instance to the
next instance in the sequence. I propose dynamic model counting and extend the dynamic
decomposition and caching technique to multiple runs on a series of problems
with similar structure. This allows us to perform Bayesian inference incrementally as
the evidence, parameter, and structure of the network change. Experimental results
show that my approach yields significant improvements over previous model counting
approaches on multiple challenging Bayesian network instances.
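The quantity at the heart of this approach, the weighted model count, can be defined by a brute-force enumerator. Real counters use backtracking search with component decomposition and caching rather than enumeration; this sketch only pins down what is being computed:

```python
from itertools import product

def weighted_model_count(clauses, weights):
    """Exact weighted model count of a CNF by full enumeration
    (exponential in the number of variables; for definition only).
    clauses: list of clauses, each a list of DIMACS-style ints
    (positive literal = variable true, negative = false, vars 1..n).
    weights: weights[v] = (weight when var v+1 is false,
                           weight when var v+1 is true)."""
    n = len(weights)
    total = 0.0
    for assign in product([False, True], repeat=n):
        # keep only assignments satisfying every clause
        if all(any(assign[abs(l) - 1] == (l > 0) for l in c)
               for c in clauses):
            w = 1.0
            for v, val in enumerate(assign):
                w *= weights[v][1] if val else weights[v][0]
            total += w
    return total
```

When a Bayesian network is encoded as a CNF whose literal weights carry the network parameters, this sum equals the probability of the encoded evidence.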
Efficient Probabilistic Inference Algorithms for Cooperative Multiagent Systems
Probabilistic reasoning methods, Bayesian networks (BNs) in particular, have emerged as an effective and central tool for reasoning under uncertainty. In a multi-agent environment, agents equipped with local knowledge often need to collaborate and reason about a larger uncertainty domain. Multiply sectioned Bayesian networks (MSBNs) provide a solution for the probabilistic reasoning of cooperative agents in such a setting.

In this thesis, we first aim to improve the efficiency of current MSBN exact inference algorithms. We show that by exploiting the calculation schema and the semantic meaning of inter-agent messages, we can significantly reduce an agent's local computational cost as well as the inter-agent communication overhead. Our novel technical contributions include 1) a new message passing architecture based on an MSBN linked junction tree forest (LJF); 2) a suite of algorithms extended from our work in BNs to provide the semantic analysis of inter-agent messages; 3) a fast marginal calibration algorithm, designed for an LJF, that guarantees exact results with a minimum local and global cost.

We then investigate how to incorporate approximation techniques in the MSBN framework. We present a novel local adaptive importance sampler (LLAIS) designed to apply localized stochastic sampling while maintaining the LJF structure. The LLAIS sampler provides accurate estimations for local posterior beliefs and promotes efficient calculation of inter-agent messages.

We also address the problem of online monitoring for cooperative agents. As the MSBN model is restricted to static domains, we introduce an MA-DBN model based on a combination of the MSBN and dynamic Bayesian network (DBN) models. We show that effective multi-agent online monitoring with bounded error is possible in an MA-DBN through a new secondary inference structure and a factorized representation of forward messages.
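The importance-sampling idea behind samplers like the one described above can be illustrated with plain likelihood weighting on a two-node network A -> B: sample the unobserved variable from its prior and weight each sample by the likelihood of the evidence. This is a generic sketch, not the LLAIS algorithm itself:

```python
import random

def likelihood_weighting(p_a, p_b_given_a, b_obs, n=100000, seed=0):
    """Estimate P(A=1 | B=b_obs) in a two-node network A -> B.
    p_a: prior P(A=1); p_b_given_a[a]: P(B=1 | A=a)."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        a = 1 if rng.random() < p_a else 0          # sample A from prior
        # weight by likelihood of the observed value of B
        w = p_b_given_a[a] if b_obs == 1 else 1.0 - p_b_given_a[a]
        num += w * a
        den += w
    return num / den
```

In the MSBN setting the same weighting principle is applied locally within a junction-tree structure, so that each agent samples only its own subdomain.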
Local Probability Distributions in Bayesian Networks: Knowledge Elicitation and Inference
Bayesian networks (BNs) have proven to be a modeling framework capable of capturing uncertain knowledge and have been applied successfully in many domains for over 25 years. The strength of Bayesian networks lies in the graceful combination of probability theory and a graphical structure representing probabilistic dependencies among domain variables in a compact manner that is intuitive for humans. One major challenge related to building practical BN models is specification of conditional probability distributions. The number of probability distributions in a conditional probability table for a given variable is exponential in its number of parent nodes, so that defining them becomes problematic or even impossible from a practical standpoint. The objective of this dissertation is to develop a better understanding of models for compact representations of local probability distributions. The hypothesis is that such models should allow for building larger models more efficiently and lead to a wider range of BN applications.
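The exponential blow-up motivating compact local distributions is easy to quantify; a minimal sketch (function names are illustrative, and noisy-OR stands in for the family of compact models the dissertation studies):

```python
def cpt_rows(num_parents, states_per_parent=2):
    """Number of conditional distributions in a full CPT:
    one per joint configuration of the parents."""
    return states_per_parent ** num_parents

def noisy_or_params(num_parents):
    """A noisy-OR model needs one parameter per parent plus a leak."""
    return num_parents + 1
```

With 20 binary parents a full CPT holds over a million distributions, while a noisy-OR needs 21 numbers, which is the gap that makes elicitation feasible.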
Surprise: An Alternative Qualitative Uncertainty Model
This dissertation embodies a study of the concept of surprise as a basis for constructing qualitative calculi for representing and reasoning about uncertain knowledge. Two functions are presented, kappa++ and z, which construct qualitative ranks for events by obtaining the order-of-magnitude abstraction of the degree of surprise associated with them. The functions use natural numbers to classify events based on their associated surprise and aim at providing a ranking that improves on those provided by existing ranking functions. This in turn enables the use of such functions in an a la carte probabilistic system where one can choose the level of detail required to represent uncertain knowledge depending on the requirements of the application.

The proposed ranking functions are defined along with the surprise-update models associated with them. The reasoning mechanisms associated with the functions are developed mathematically and graphically. The advantages and expected limitations of both functions are compared with respect to each other and with existing ranking functions in the context of a bioinformatics application known as "reverse engineering of genetic regulatory networks", in which the relations among various genetic components are discovered through the examination of a large amount of collected data. The ranking functions are examined in this context via graphical models which are developed exclusively for this purpose and which utilize the developed functions to represent uncertain knowledge at various levels of detail.
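The order-of-magnitude abstraction underlying such rankings can be illustrated with a minimal Spohn-style kappa ranking, which maps a probability to the integer exponent n with p roughly equal to eps**n. This is a generic sketch of the underlying idea, not the dissertation's kappa++ or z functions:

```python
import math

def kappa_rank(p, eps=0.1):
    """Order-of-magnitude rank of probability p: larger ranks mean
    more surprising events; impossible events get infinite rank."""
    if p <= 0:
        return math.inf
    return round(math.log(p) / math.log(eps))
```

Ordinary events (p near 1) get rank 0, while each factor-of-eps drop in probability adds one degree of surprise, which is the kind of coarse, natural-number scale the abstract describes.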