
    Impact of Bayesian network model structure on the accuracy of medical diagnostic systems

    While Bayesian network models may contain a handful of numerical parameters that are important for their quality, several empirical studies have confirmed that the overall precision of their probabilities is not crucial. In this paper, we study the impact of the structure of a Bayesian network on the precision of medical diagnostic systems. We show that the structure is also not that important: the diagnostic accuracy of several medical diagnostic models changes minimally when we subject their structures to transformations such as arc removal and arc reversal. © 2014 Springer International Publishing

    Cost-Sensitive Decision Tree with Multiple Resource Constraints

    Resource constraints are commonly found in classification tasks. For example, there could be a budget limit on implementation and a deadline for finishing the classification task. Applying the top-down approach for tree induction in this situation may have significant drawbacks. In particular, it is difficult, especially in an early stage of tree induction, to assess an attribute’s contribution to improving the total implementation cost and its impact on attribute selection in later stages because of the deadline constraint. To address this problem, we propose an innovative algorithm, namely, the Cost-Sensitive Associative Tree (CAT) algorithm. Essentially, the algorithm first extracts and retains association classification rules from the training data which satisfy resource constraints, and then uses the rules to construct the final decision tree. The approach has advantages over the traditional top-down approach, first because only feasible classification rules are considered in the tree induction and, second, because their costs and resource use are known. In contrast, in the top-down approach, the information is not available for selecting splitting attributes. The experiment results show that the CAT algorithm significantly outperforms the top-down approach and adapts very well to available resources.
    Keywords: cost-sensitive learning, mining methods and algorithms, decision trees

    Impact of Quality of Bayesian Network Parameters on Accuracy of Medical Diagnostic Systems

    While most knowledge engineers believe that the quality of results obtained by means of Bayesian networks is not too sensitive to imprecision in probabilities, this remains a conjecture with only modest empirical support. We summarize the results of several previously presented experiments involving the Hepar II model, in which we manipulated the quality of the model's numerical parameters and checked the impact of these manipulations on the model's accuracy. The chief contribution of this paper is the results of replicating our experiments on several medical diagnostic models derived from data sets available at the Irvine Machine Learning Repository. We show that the results of our experiments are qualitatively identical to those obtained earlier with Hepar II.

    Interpretable Narrative Explanation for ML Predictors with LP: A Case Study for XAI

    In the era of digital revolution, individual lives are going to cross and interconnect ubiquitous online domains and offline reality based on smart technologies: discovering, storing, processing, learning, analysing, and predicting from huge amounts of environment-collected data. Sub-symbolic techniques, such as deep learning, play a key role there, yet they are often built as black boxes, which are not inspectable, interpretable, or explainable. New research efforts towards explainable artificial intelligence (XAI) are trying to address those issues, with the final purpose of building understandable, accountable, and trustable AI systems, though seemingly with a long way to go. Generally speaking, while we fully understand and appreciate the power of sub-symbolic approaches, we believe that symbolic approaches to machine intelligence, once properly combined with sub-symbolic ones, have a critical role to play in order to achieve key properties of XAI such as observability, interpretability, explainability, accountability, and trustability. In this paper we describe an example of integration of symbolic and sub-symbolic techniques. First, we sketch a general framework where symbolic and sub-symbolic approaches could fruitfully combine to produce intelligent behaviour in AI applications. Then, we focus in particular on the goal of building a narrative explanation for ML predictors: to this end, we exploit the logical knowledge obtained by translating decision tree predictors into logical programs.

    Distinguishing cause from effect using observational data: methods and benchmarks

    The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y. An example is to decide whether altitude causes temperature, or vice versa, given only joint measurements of both variables. Even under the simplifying assumptions of no confounding, no feedback loops, and no selection bias, such bivariate causal discovery problems are challenging. Nevertheless, several approaches for addressing those problems have been proposed in recent years. We review two families of such methods: Additive Noise Methods (ANM) and Information Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs that consists of data for 100 different cause-effect pairs selected from 37 datasets from various domains (e.g., meteorology, biology, medicine, engineering, and economy) and motivate our decisions regarding the "ground truth" causal directions of all pairs. We evaluate the performance of several bivariate causal discovery methods on these real-world benchmark data and, in addition, on artificially simulated data. Our empirical results on real-world data indicate that certain methods are indeed able to distinguish cause from effect using purely observational data, although more benchmark data would be needed to obtain statistically significant conclusions. One of the best performing methods overall is the additive-noise method originally proposed by Hoyer et al. (2009), which obtains an accuracy of 63 ± 10 % and an AUC of 0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of this work we prove the consistency of that method.
    Comment: 101 pages, second revision submitted to Journal of Machine Learning Research
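
    As a rough illustration of the additive-noise idea mentioned in the abstract above, the following minimal Python sketch regresses each variable on the other and prefers the direction whose residuals look more independent of the input. It is not the authors' implementation: the cubic polynomial regression, the biased HSIC statistic with a median-heuristic Gaussian kernel, and the names `hsic`, `anm_score`, and `infer_direction` are all assumptions chosen for brevity, and the paper's own estimators may differ.

```python
import numpy as np


def _gaussian_gram(v):
    """Gaussian kernel Gram matrix with a median-heuristic bandwidth."""
    d2 = (v[:, None] - v[None, :]) ** 2
    med = np.median(d2[d2 > 0])  # median of nonzero squared distances
    return np.exp(-d2 / med)


def hsic(a, b):
    """Biased HSIC estimate; larger values suggest stronger dependence."""
    n = len(a)
    K = _gaussian_gram(a)
    L = _gaussian_gram(b)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n**2


def anm_score(cause, effect, degree=3):
    """Fit effect = f(cause) + noise with a polynomial (an assumption of
    this sketch) and measure dependence between residuals and cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residual = effect - np.polyval(coeffs, cause)
    return hsic(cause, residual)


def infer_direction(x, y, degree=3):
    """Return the direction whose residuals are less dependent on the input."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    s_xy = anm_score(x, y, degree)  # score for the hypothesis X -> Y
    s_yx = anm_score(y, x, degree)  # score for the hypothesis Y -> X
    direction = "X->Y" if s_xy < s_yx else "Y->X"
    return direction, s_xy, s_yx


if __name__ == "__main__":
    # Synthetic data in which X causes Y through a nonlinear mechanism.
    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, 500)
    y = x**3 + rng.normal(0.0, 1.0, 500)
    print(infer_direction(x, y))
```

    The sketch compares raw HSIC values rather than running an independence test, so it only ranks the two directions and says nothing about statistical significance.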