
    Distribution of Mutual Information

    The mutual information of two random variables i and j with joint probabilities t_ij is commonly used in learning Bayesian nets as well as in many other fields. The probabilities t_ij are usually estimated by the empirical sampling frequency n_ij/n, leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?" one has to go beyond the point estimate. In the Bayesian framework, one can answer these questions by utilizing a (second-order) prior distribution p(t) comprising prior information about t. From the prior p(t) one can compute the posterior p(t|n), from which the distribution p(I|n) of the mutual information can be calculated. We derive reliable and quickly computable approximations for p(I|n). We concentrate on the mean, variance, skewness, and kurtosis, and on non-informative priors. For the mean we also give an exact expression. Numerical issues and the range of validity are discussed. Comment: 8 pages.
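    The leading-order moment formulas this line of work reports are easy to illustrate. The sketch below is a minimal Python rendition, not the paper's exact derivation: it plugs the empirical frequencies into the point estimate I(n_ij/n), adds the first-order (r-1)(s-1)/(2n) bias correction to approximate the posterior mean, and uses the first-order variance (K - J^2)/n. The function names and the example table are illustrative assumptions.

```python
import numpy as np

def mi_point_estimate(counts):
    """Empirical (descriptive) mutual information I(n_ij/n), in nats."""
    n = counts.sum()
    p = counts / n
    pi = p.sum(axis=1, keepdims=True)    # row marginals
    pj = p.sum(axis=0, keepdims=True)    # column marginals
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / (pi @ pj)[mask])).sum())

def mi_posterior_moments(counts):
    """First-order posterior mean and variance of the mutual information
    under a (second-order) Dirichlet prior: mean ~ I_hat + (r-1)(s-1)/(2n),
    variance ~ (K - J^2)/n with J = I_hat and K the second moment of the
    log-ratio.  Exact and higher-order terms from the paper are omitted."""
    n = counts.sum()
    r, s = counts.shape
    i_hat = mi_point_estimate(counts)
    p = counts / n
    pi = p.sum(axis=1, keepdims=True)
    pj = p.sum(axis=0, keepdims=True)
    mask = p > 0
    log_ratio = np.log(p[mask] / (pi @ pj)[mask])
    k = float((p[mask] * log_ratio ** 2).sum())
    mean = i_hat + (r - 1) * (s - 1) / (2 * n)
    var = (k - i_hat ** 2) / n
    return mean, var

# Illustrative 2x2 contingency table (hypothetical data).
counts = np.array([[30, 10], [5, 55]])
print(mi_point_estimate(counts))
print(mi_posterior_moments(counts))
```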

    Robust Feature Selection by Mutual Information Distributions

    Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification with the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows the above methods to be extended to incomplete samples in an efficient and effective way. Comment: 8 two-column pages.
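    A hedged sketch of how such a filter might look: a feature is retained when the posterior probability that its mutual information with the class exceeds a small threshold is high, with p(I|data) approximated by a normal distribution built from the posterior mean and variance. The threshold eps, the confidence level, and the moment helper passed in (e.g. mi_posterior_moments from the sketch above) are illustrative choices, not the paper's exact procedure.

```python
import math

def select_features(tables, posterior_moments, eps=0.02, level=0.95):
    """Keep feature k when P(I_k > eps | data) >= level under a normal
    approximation to p(I|data) with the supplied posterior mean/variance.
    tables[k] is the class-vs-feature-k contingency table."""
    selected = []
    for k, counts in enumerate(tables):
        mean, var = posterior_moments(counts)
        z = (mean - eps) / math.sqrt(max(var, 1e-12))
        p_above = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # Phi(z)
        if p_above >= level:
            selected.append(k)
    return selected

# Usage: select_features(tables, mi_posterior_moments), reusing the
# moment helper from the previous sketch.
```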

    Robust Inference of Trees

    This paper is concerned with the reliable inference of optimal tree-approximations to the dependency structure of an unknown distribution generating data. The traditional approach to the problem measures the dependency strength between random variables by the mutual information index. In this paper, reliability is achieved by Walley's imprecise Dirichlet model, which generalizes Bayesian learning with Dirichlet priors. Adopting the imprecise Dirichlet model results in posterior interval expectations for mutual information, and in a set of plausible trees consistent with the data. Reliable inference about the actual tree is achieved by focusing on the substructure common to all the plausible trees. We develop an exact algorithm that infers the substructure in time O(m^4), m being the number of random variables. The new algorithm is applied to a set of data sampled from a known distribution. The method is shown to reliably infer edges of the actual tree even when the data are very scarce, unlike the traditional approach. Finally, we provide lower and upper credibility limits for mutual information under the imprecise Dirichlet model. These enable the previous developments to be extended to a full inferential method for trees. Comment: 26 pages, 7 figures.
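    For intuition, a Chow-Liu-style tree is a maximum spanning tree over pairwise mutual information weights. The sketch below builds such trees with Kruskal's algorithm and keeps only the edges shared by the trees obtained at the lower and upper interval expectations. This intersection heuristic is a deliberately crude stand-in for the paper's exact O(m^4) algorithm over all plausible trees, and how the interval expectations are computed under the imprecise Dirichlet model is not reproduced here.

```python
def chow_liu_edges(weights):
    """Maximum spanning tree (Kruskal with union-find) over pairwise
    mutual information weights, given as {(i, j): w} with i < j."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = set()
    for (i, j), _w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            edges.add((i, j))
    return edges

def robust_tree_edges(mi_lower, mi_upper):
    """Keep only edges appearing in the trees built from both the lower
    and upper interval expectations of mutual information; a simplified
    proxy for the common substructure of all plausible trees."""
    return chow_liu_edges(mi_lower) & chow_liu_edges(mi_upper)
```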

    Distribution of Mutual Information from Complete and Incomplete Data

    Mutual information is widely used, in a descriptive way, to measure the stochastic dependence of categorical random variables. In order to address questions such as the reliability of the descriptive value, one must consider sample-to-population inferential approaches. This paper deals with the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean, and analytical approximations for the variance, skewness, and kurtosis are derived. These approximations have a guaranteed accuracy level of the order O(1/n^3), where n is the sample size. Leading-order approximations for the mean and the variance are derived in the case of incomplete samples. The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. In fact, they can be computed with the same order of complexity needed for descriptive mutual information. This makes the distribution of mutual information a concrete alternative to descriptive mutual information in many applications that would benefit from moving to the inductive side. Some of these prospective applications are discussed, and one of them, namely feature selection, is shown to perform significantly better when inductive mutual information is used. Comment: 26 pages, LaTeX, 5 figures, 4 tables.
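    The incomplete-sample case can be pictured with an EM-style expected-count step: records with one missing value are spread over the missing coordinate in proportion to a current joint estimate, and the resulting fractional table is then fed to moment formulas such as those sketched earlier. This is only an illustration of the idea; the paper derives proper leading-order expressions for the missing-data case.

```python
import numpy as np

def expected_counts(pairs, p_joint):
    """EM-style E-step for incomplete samples: a record with one missing
    value (None) is spread over the missing coordinate in proportion to
    the current joint estimate p_joint (an r x s array)."""
    counts = np.zeros(p_joint.shape)
    for i, j in pairs:
        if i is not None and j is not None:
            counts[i, j] += 1.0
        elif i is not None:                     # j missing
            counts[i, :] += p_joint[i, :] / p_joint[i, :].sum()
        elif j is not None:                     # i missing
            counts[:, j] += p_joint[:, j] / p_joint[:, j].sum()
        # records with both values missing carry no information
    return counts

# The fractional table can then be passed to moment formulas such as
# mi_posterior_moments from the first sketch.
```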

    Dependency structure matrix, genetic algorithms, and effective recombination

    In many different fields, researchers are often confronted by problems arising from complex systems. Simple heuristics or even enumeration work quite well on small and easy problems; however, to efficiently solve large and difficult problems, proper decomposition is the key. In this paper, investigating and analyzing interactions between components of complex systems sheds some light on problem decomposition. By recognizing three bare-bones interactions (modularity, hierarchy, and overlap), facet-wise models are developed to dissect and inspect problem decomposition in the context of genetic algorithms. The proposed genetic algorithm design utilizes a matrix representation of an interaction graph to analyze and explicitly decompose the problem. The results from this paper should benefit research both technically and scientifically. Technically, this paper develops an automated dependency structure matrix clustering technique and utilizes it to design a model-building genetic algorithm that learns and delivers the problem structure. Scientifically, the explicit interaction model describes the problem structure very well and helps researchers gain important insights through the explicitness of the procedure. This work was sponsored by the Taiwan National Science Council under grant NSC97-2218-E-002-020-MY3, the U.S. Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grants FA9550-06-1-0370 and FA9550-06-1-0096, the U.S. National Science Foundation under CAREER grant ECS-0547013, ITR grant DMR-03-25939 at the Materials Computation Center, grant ISS-02-09199 at the US National Center for Supercomputing Applications, UIUC, and the Portuguese Foundation for Science and Technology under grants SFRH/BD/16980/2004 and PTDC/EIA/67776/2006.
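    As a rough illustration of DSM clustering (not the paper's metric-driven technique), the sketch below greedily moves each variable to the cluster whose members it interacts with most, starting from singleton clusters. The function dsm_cluster and the block-diagonal example are hypothetical.

```python
import numpy as np

def dsm_cluster(dsm, passes=3):
    """Greedy clustering of a boolean dependency structure matrix (DSM):
    each variable is repeatedly moved to the cluster whose members it
    interacts with most.  A much-simplified stand-in for the automated
    DSM clustering technique, which optimizes a model-building metric
    rather than this raw interaction count."""
    m = dsm.shape[0]
    labels = list(range(m))                   # start from singletons
    for _ in range(passes):
        for v in range(m):
            def score(c):
                return sum(dsm[v, u] for u in range(m)
                           if labels[u] == c and u != v)
            best = max(set(labels), key=score)
            if score(best) > score(labels[v]):
                labels[v] = best
    return labels

# Example: two 3-variable modules with no cross-interaction.
dsm = np.kron(np.eye(2, dtype=int), np.ones((3, 3), dtype=int))
print(dsm_cluster(dsm))   # two clusters, up to label renaming
```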

    The posterior probability of Bayes nets with strong dependences

    No full text