9 research outputs found

    Complexity Results and Approximation Strategies for MAP Explanations

    MAP is the problem of finding a most probable instantiation of a set of variables given evidence. MAP has always been perceived to be significantly harder than the related problems of computing the probability of a variable instantiation (Pr) and computing the most probable explanation (MPE). This paper investigates the complexity of MAP in Bayesian networks. Specifically, we show that MAP is complete for NP^PP and provide further negative complexity results for algorithms based on variable elimination. We also show that MAP remains hard even when MPE and Pr become easy. For example, we show that MAP is NP-complete when the networks are restricted to polytrees, and even then cannot be effectively approximated. Given the difficulty of computing MAP exactly, and the difficulty of approximating MAP while providing useful guarantees on the resulting approximation, we investigate best-effort approximations. We introduce a generic MAP approximation framework and provide two instantiations of it: one for networks which are amenable to exact inference (Pr), and one for networks for which even exact inference is too hard. This allows MAP approximation on networks that are too complex to solve exactly even for the easier problems, Pr and MPE. Experimental results indicate that these approximation algorithms provide much better solutions than standard techniques, and yield accurate MAP estimates in many cases.
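The MAP/MPE distinction in this abstract can be made concrete with a tiny worked example. The sketch below is not from the paper: the three-variable chain A → B → C and its CPTs are hypothetical, chosen so that brute-force enumeration shows the MAP assignment for A differing from A's value in the MPE.

```python
from itertools import product

# Toy chain network A -> B -> C with binary variables; the CPTs below are
# hypothetical, chosen only to illustrate the MAP/MPE distinction.
p_a = {0: 0.5, 1: 0.5}
p_b_given_a = {(0, 0): 0.5, (1, 0): 0.5,   # keys are (b, a)
               (0, 1): 0.1, (1, 1): 0.9}
p_c_given_b = {(0, 0): 0.2, (1, 0): 0.8,   # keys are (c, b)
               (0, 1): 0.4, (1, 1): 0.6}

def joint(a, b, c):
    return p_a[a] * p_b_given_a[(b, a)] * p_c_given_b[(c, b)]

evidence_c = 1  # observed evidence: C = 1

# MPE: the single most probable full instantiation of {A, B} given C = 1.
mpe = max(product((0, 1), repeat=2),
          key=lambda ab: joint(ab[0], ab[1], evidence_c))

# MAP over {A}: marginalize B out first, then maximize over A.
map_a = max((0, 1),
            key=lambda a: sum(joint(a, b, evidence_c) for b in (0, 1)))

# Here mpe == (1, 1) but map_a == 0: projecting the MPE onto the MAP
# variables does not, in general, give the MAP assignment.
```

With these numbers, the joint weights given C = 1 are 0.20 and 0.15 for A = 0 (summing to 0.35) versus 0.04 and 0.27 for A = 1 (summing to 0.31), so the MPE picks A = 1 while MAP picks A = 0 — the marginalization inside MAP is what makes it the harder query.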

    A Bayesian Abduction Model For Sensemaking

    This research develops a Bayesian Abduction Model for Sensemaking Support (BAMSS) for information fusion in sensemaking tasks. Two methods are investigated. The first is classical Bayesian information fusion with belief updating (using a Bayesian clustering algorithm) and abductive inference. The second method uses a genetic algorithm (BAMSS-GA) to search for the k best most probable explanations (MPEs) in the network. Using various data from the recent Iraq and Afghanistan conflicts, experimental simulations were conducted to compare the methods using posterior probability values, which can give insightful information for prospective sensemaking. The inference results demonstrate the utility of BAMSS as a computational model for sensemaking. The major results obtained are: (1) the inference results from BAMSS-GA gave average posterior probabilities that were 10^3 better than those produced by BAMSS; (2) BAMSS-GA gave more consistent posterior probabilities, as measured by variances; and (3) BAMSS was able to give an MPE, while BAMSS-GA was able to identify the optimal values for the k MPEs. In the experiments, out of 20 MPEs generated by BAMSS, BAMSS-GA was able to identify 7 plausible network solutions, reducing the amount of information needed for sensemaking and shrinking the inference search space by 7/20 (35%). The results reveal that a GA can be used successfully in Bayesian information fusion as a search technique to identify the significant posterior probabilities useful for sensemaking. BAMSS-GA was also more robust in overcoming the problem of bounded search that constrains Bayesian clustering and the inference state space in BAMSS.
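The GA-based search for k-best explanations can be sketched in a few lines. BAMSS-GA's actual network, fitness function, and parameters are not given here, so the sketch below substitutes a stand-in fitness (an independent-variable posterior over a binary explanation vector) purely to show the mechanics: selection, one-point crossover, bit-flip mutation, and extraction of the k best distinct explanations.

```python
import random

random.seed(0)

# Stand-in fitness: posterior probability of a candidate explanation,
# faked here as a product of independent per-variable probabilities.
# BAMSS-GA evaluates candidates against a real Bayesian network instead.
probs = [0.9, 0.2, 0.7, 0.4, 0.8, 0.6]

def fitness(bits):
    p = 1.0
    for b, q in zip(bits, probs):
        p *= q if b else (1 - q)
    return p

def ga_k_best(k=3, pop_size=30, generations=40, mut=0.1):
    pop = [[random.randint(0, 1) for _ in probs] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(probs))
            child = a[:cut] + b[cut:]          # one-point crossover
            child = [1 - g if random.random() < mut else g for g in child]
            children.append(child)
        pop = parents + children
    # Return up to k best distinct explanations found.
    seen, best = set(), []
    for ind in sorted(pop, key=fitness, reverse=True):
        t = tuple(ind)
        if t not in seen:
            seen.add(t)
            best.append(t)
        if len(best) == k:
            break
    return best
```

For this separable toy fitness the optimum is the bitstring that picks the larger of q and 1 − q per position; the point of the k-best list, as in BAMSS-GA, is that the runner-up explanations are themselves useful for sensemaking, not just the single MPE.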

    Regularized model learning in EDAs for continuous and multi-objective optimization

    Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs), determining their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property, which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of the EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization, and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired by multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small to medium dimensionality, using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
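The idea of ordering interval-valued objectives by a degree of dominance can be sketched concretely. The thesis's exact α-degree Pareto dominance definition is not reproduced here; the sketch below is a simplified stand-in that assumes each noisy objective value is uniformly distributed over its interval and declares dominance when one solution is better in every objective with probability at least α.

```python
# Simplified interval-based dominance for noisy objectives (minimization).
# The uniform-distribution assumption and the "every objective >= alpha"
# rule are illustrative choices, not the thesis's exact definition.

def p_less(x, y):
    """P(U < V) for U ~ Uniform(x), V ~ Uniform(y), non-degenerate intervals.

    The CDF of U is piecewise linear, so integrating it over y with the
    trapezoid rule on its breakpoints gives the exact probability."""
    (a1, a2), (b1, b2) = x, y
    cdf = lambda v: min(max((v - a1) / (a2 - a1), 0.0), 1.0)
    # Breakpoints of the CDF clipped into [b1, b2], plus the interval ends.
    pts = sorted({b1, b2, min(max(a1, b1), b2), min(max(a2, b1), b2)})
    area = sum((q - p) * (cdf(p) + cdf(q)) / 2 for p, q in zip(pts, pts[1:]))
    return area / (b2 - b1)

def alpha_dominates(u, v, alpha=0.6):
    """u dominates v at degree alpha if u is better in every objective
    with probability at least alpha."""
    return all(p_less(ui, vi) >= alpha for ui, vi in zip(u, v))
```

With disjoint intervals the dominance probability is 1 and the relation reduces to ordinary Pareto dominance; for identical intervals it is 0.5, so neither solution dominates at any α above one half, which is exactly the incomparability that interval noise should induce.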