10 research outputs found

    Dirichlet Bayesian Network Scores and the Maximum Relative Entropy Principle

    A classic approach for learning Bayesian networks from data is to identify a maximum a posteriori (MAP) network structure. In the case of discrete Bayesian networks, MAP networks are selected by maximising one of several possible Bayesian Dirichlet (BD) scores; the best known is the Bayesian Dirichlet equivalent uniform (BDeu) score from Heckerman et al. (1995). The key properties of BDeu arise from its uniform prior over the parameters of each local distribution in the network: it makes structure learning computationally efficient, it does not require the elicitation of prior knowledge from experts, and it satisfies score equivalence. In this paper we review the derivation and the properties of BD scores, and of BDeu in particular, and we link them to the corresponding entropy estimates to study them from an information-theoretic perspective. To this end, we work in the context of the foundational work of Giffin and Caticha (2007), who showed that Bayesian inference can be framed as a particular case of the maximum relative entropy principle. We use this connection to show that BDeu should not be used for structure learning from sparse data, since it violates the maximum relative entropy principle; and that it is also problematic from a more classic Bayesian model selection perspective, because it produces Bayes factors that are sensitive to the value of its only hyperparameter. Using a large simulation study, we found in our previous work (Scutari, 2016) that the Bayesian Dirichlet sparse (BDs) score seems to provide better accuracy in structure learning; in this paper we further show that BDs does not suffer from the issues above, and we recommend using it for sparse data instead of BDeu. Finally, we show that these issues are in fact different aspects of the same problem and a consequence of the distributional assumptions of the prior. (Comment: 20 pages, 4 figures; extended version submitted to Behaviormetrika.)
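The BDeu score discussed above has a closed form in terms of log-Gamma functions. As a minimal illustrative sketch (not the authors' code), the following computes the log BDeu score of a single node given its parents, with the imaginary sample size `iss` as the score's only hyperparameter:

```python
from math import lgamma
from collections import Counter

def bdeu_local_score(data, child, parents, levels, iss=1.0):
    """Log BDeu score of one node given its parents.

    data:    list of dicts mapping variable name -> observed level
    child:   name of the child variable
    parents: list of parent variable names
    levels:  dict mapping variable name -> list of its possible levels
    iss:     imaginary sample size, the score's only hyperparameter
    """
    r = len(levels[child])            # number of states of the child
    q = 1                             # number of parent configurations
    for p in parents:
        q *= len(levels[p])
    # count N_ij (rows per parent configuration) and N_ijk (per child level)
    n_ij, n_ijk = Counter(), Counter()
    for row in data:
        j = tuple(row[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, row[child])] += 1
    score = 0.0
    for j in n_ij:                    # unobserved configurations contribute 0
        score += lgamma(iss / q) - lgamma(iss / q + n_ij[j])
        for k in levels[child]:
            score += lgamma(iss / (q * r) + n_ijk[(j, k)]) - lgamma(iss / (q * r))
    return score
```

Summing these local scores over all nodes gives the network score; the sensitivity concern raised in the abstract is precisely about how the resulting Bayes factors vary with `iss` on sparse data.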

    Improvement of CB & BC Algorithms (CB* Algorithm) for Learning Structure of Bayesian Networks as Classifier in Data Mining

    There are two well-known categories of approaches (as basic principles of the classification process) for learning the structure of a Bayesian network (BN) in data mining (DM): scoring-based and constraint-based algorithms. Inspired by those approaches, we present a new CB* algorithm that is developed by considering four related algorithms: K2, PC, CB, and BC. The improvement obtained by our algorithm derives from the strength of its primitives in the process of learning the structure of a BN. Specifically, the CB* algorithm is appropriate for incomplete databases (those with missing values) and does not require any prior information about node ordering.

    Rational Coordination in Multi-Agent Environments

    We adopt the decision-theoretic principle of expected utility maximization as a paradigm for designing autonomous rational agents, and present a framework that uses this paradigm to determine the choice of coordinated action. We endow an agent with a specialized representation that captures the agent's knowledge about the environment and about the other agents, including its knowledge about their states of knowledge, which can include what they know about the other agents, and so on. This reciprocity leads to a recursive nesting of models. Our framework puts forth a representation for the recursive models and, under the assumption that the nesting of models is finite, uses dynamic programming to solve this representation for the agent's rational choice of action. Using a decision-theoretic approach, our work addresses concerns of agent decision-making about coordinated action in unpredictable situations, without imposing upon agents pre-designed prescriptions, or protocols, about standard rules of interaction. We implemented our method in a number of domains and we show results of coordination among our automated agents, among human-controlled agents, and among our agents coordinating with human-controlled agents.
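The finitely nested models described above can be solved bottom-up by recursion. Below is a deliberately simplified sketch, not the paper's Recursive Modeling Method itself: it assumes a common-payoff coordination game (both agents share one payoff matrix) and falls back to a uniform belief over the other agent's actions at the deepest nesting level.

```python
def best_response(payoff, depth):
    """Choose our action by recursively modeling the other agent.

    payoff[a][b]: shared utility when we play a and the other plays b.
    depth: how many levels of nested models remain; at depth 0 we
    fall back to a uniform belief over the other agent's actions.
    Returns (best_action_index, its_expected_utility).
    """
    n_ours, n_theirs = len(payoff), len(payoff[0])
    if depth == 0:
        belief = [1.0 / n_theirs] * n_theirs
    else:
        # From the other agent's view the shared payoff matrix is
        # transposed, and it models us one nesting level shallower.
        their_payoff = [[payoff[a][b] for a in range(n_ours)]
                        for b in range(n_theirs)]
        their_action, _ = best_response(their_payoff, depth - 1)
        belief = [1.0 if b == their_action else 0.0 for b in range(n_theirs)]
    eus = [sum(payoff[a][b] * belief[b] for b in range(n_theirs))
           for a in range(n_ours)]
    best = max(range(n_ours), key=lambda a: eus[a])
    return best, eus[best]
```

In a coordination game with payoffs [[2, 0], [0, 1]], one level of nesting lets the agent predict its partner's choice exactly instead of averaging over it, raising the expected utility of its best action from 1.0 to 2.0.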

    Rational Communication in Multi-Agent Environments

    We address the issue of rational communicative behavior among autonomous self-interested agents that have to make decisions as to what to communicate, to whom, and how. Following decision theory, we postulate that a rational speaker should design a speech act so as to optimize the benefit it obtains as the result of the interaction. We quantify the gain in the quality of interaction in terms of the expected utility, and we present a framework that allows an agent to compute the expected utilities of various communicative actions. Our framework uses the Recursive Modeling Method as the specialized representation used for decision-making in a multi-agent environment. This representation includes information about the agent's state of knowledge, including the agent's preferences, abilities and beliefs about the world, as well as the beliefs the agent has about the other agents, the beliefs it has about the other agents' beliefs, and so on. Decision-theoretic pragmatics of a communicative act can then be defined as the transformation the act induces on the agent's state of knowledge about its decision-making situation. This transformation leads to a change in the quality of interaction, expressed in terms of the expected utilities of the agent's best actions before and after the communicative act. We analyze decision-theoretic pragmatics of a number of important kinds of communicative acts and investigate their expected utilities using examples. Finally, we report on the agreement between our method of message selection and messages that human subjects choose in various circumstances, and show an implementation and experimental validation of our framework in a simulated multi-agent environment.
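The decision-theoretic pragmatics described above can be made concrete with a small numeric sketch (hypothetical code, not the paper's implementation): the value of a communicative act is the change in the expected utility of the best action between the belief state before the act and the belief state after it.

```python
def expected_utilities(payoffs, belief):
    """Expected utility of each action under a belief over world states.

    payoffs[a][s]: utility of action a in world state s.
    belief[s]: probability assigned to state s.
    """
    return [sum(p * b for p, b in zip(row, belief)) for row in payoffs]

def value_of_message(payoffs, belief_before, belief_after):
    """Decision-theoretic value of a communicative act: the gain in the
    expected utility of the best action induced by the belief update."""
    eu_before = max(expected_utilities(payoffs, belief_before))
    eu_after = max(expected_utilities(payoffs, belief_after))
    return eu_after - eu_before
```

For example, with payoffs [[10, 0], [0, 10]] over two states, a message that sharpens a uniform belief to [0.9, 0.1] raises the best action's expected utility from 5 to 9, so the act is worth 4 utility units; an uninformative message has value 0.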

    Bayesian networks for spatio-temporal integrated catchment assessment

    Includes abstract and bibliographical references (leaves 181-203). In this thesis, a methodology for integrated catchment water resources assessment using Bayesian Networks was developed. A custom-made software application that combines Bayesian Networks with GIS was used to facilitate data pre-processing and spatial modelling. Dynamic Bayesian Networks were implemented in the software for time-series modelling.

    Principled computational methods for the validation and discovery of genetic regulatory networks

    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 193-206). As molecular biology continues to evolve in the direction of high-throughput collection of data, it has become increasingly necessary to develop computational methods for analyzing observed data that are at once both sophisticated enough to capture essential features of biological phenomena and at the same time approachable in terms of their application. We demonstrate how graphical models, and Bayesian networks in particular, can be used to model genetic regulatory networks. These methods are well-suited to this problem owing to their ability to model more than pair-wise relationships between variables, their ability to guard against over-fitting, and their robustness in the face of noisy data. Moreover, Bayesian network models can be scored in a principled manner in the presence of both genomic expression and location data. We develop methods for extending Bayesian network semantics to include edge annotations that allow us to model statistical dependencies between biological factors with greater refinement. We derive principled methods for scoring these annotated Bayesian networks. Using these models in the presence of genomic expression data requires suitable methods for the normalization and discretization of this data. We present novel methods appropriate to this context for performing each of these operations. With these elements in place, we are able to apply our scoring framework both to validate models of regulatory networks in comparison with one another and to discover networks using heuristic search methods. To demonstrate the utility of this framework for the elucidation of genetic regulatory networks, we apply these methods in the context of the well-understood galactose regulatory system and the less well-understood pheromone response system in yeast. We demonstrate how genomic expression and location data can be combined in a principled manner to enable the induction of models not readily discovered if the data sources are considered in isolation. By Alexander John Hartemink, Ph.D.
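To illustrate the discretization step mentioned in the abstract, here is a generic equal-frequency (quantile) binning sketch, a common preprocessing step before scoring discrete Bayesian networks on expression data; it is an assumption for illustration, not the thesis's actual method:

```python
def quantile_discretize(values, n_levels=3):
    """Discretize continuous expression values into n_levels
    equal-frequency bins (e.g. 0 = low, 1 = medium, 2 = high).

    Each value's bin is determined by its rank, so every bin receives
    roughly the same number of observations regardless of the scale
    or skew of the raw measurements.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = min(rank * n_levels // len(values), n_levels - 1)
    return labels
```

Equal-frequency binning is a natural default here because Dirichlet-based scores behave poorly when some discrete levels are nearly empty, which equal-width bins often produce on skewed expression data.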

    A comparison of scientific and engineering criteria for Bayesian model selection

    Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial. Keywords: model selection, model averaging, Bayesian selection criteria.
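The relationship among the three estimates can be sketched numerically. In this hypothetical code (not the paper's implementation), `posteriors` holds P(m | D) for each candidate model, `predictives` holds each model's distribution over the next observation, and log score stands in for the predictive-performance utility:

```python
import math

def model_average(posteriors, predictives):
    """Model-averaged predictive: sum over models of P(m|D) * P(x|m, D)."""
    return [sum(p * pred[k] for p, pred in zip(posteriors, predictives))
            for k in range(len(predictives[0]))]

def scientific_criterion(posteriors, predictives):
    """SC: predict with the a posteriori most probable single model."""
    return predictives[max(range(len(posteriors)), key=posteriors.__getitem__)]

def engineering_criterion(posteriors, predictives):
    """EC: predict with the single model expected to score best on the
    next observation, i.e. the one maximising the expected log score
    under the model-averaged predictive distribution."""
    avg = model_average(posteriors, predictives)
    def expected_log_score(pred):
        return sum(a * math.log(p) for a, p in zip(avg, pred) if a > 0)
    return max(predictives, key=expected_log_score)
```

By Gibbs' inequality the model average itself maximises the expected log score, which is one way to see the ordering stated in the abstract: the model average is at least as good a predictor as the EC model, which is at least as good as the SC model.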