
    Data Cube Approximation and Mining using Probabilistic Modeling

    On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow the exploration of data cubes along different analysis axes (dimensions) and at different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate and multidimensional values. With the first technique, we compute the set of components that best fit the initial data set and whose superposition coincides with the original data; with the second, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions, and discover possible outliers in data cells. A real-life example is used to (i) discuss the potential benefits of the modeling output for cube exploration and mining, (ii) show how OLAP queries can be answered in an approximate way, and (iii) illustrate the strengths and limitations of these modeling approaches.
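    As a concrete illustration of the first technique, here is a minimal sketch (not from the paper; the update rule, the toy cube, and all names are illustrative assumptions) of a rank-R non-negative multi-way (CP) factorization of a 3-way cube, computed with standard multiplicative updates, whose components superpose to approximate the original data:

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product of U (m x R) and V (n x R) -> (m*n x R)."""
    m, R = U.shape
    n, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(m * n, R)

def nonneg_cp(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Rank-`rank` non-negative CP factorization of a 3-way array X,
    fitted with Lee-Seung-style multiplicative updates."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.random((d, rank)) for d in (I, J, K))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        M = khatri_rao(B, C); A *= (X1 @ M) / (A @ (M.T @ M) + eps)
        M = khatri_rao(A, C); B *= (X2 @ M) / (B @ (M.T @ M) + eps)
        M = khatri_rao(A, B); C *= (X3 @ M) / (C @ (M.T @ M) + eps)
    return A, B, C

# Hypothetical cube: 4 products x 3 regions x 2 quarters of aggregate sales.
X = np.random.default_rng(1).random((4, 3, 2))
A, B, C = nonneg_cp(X, rank=2)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)  # superposition of the 2 components
print("relative error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

    Storing the factor matrices instead of the full cube is what yields the compression: here 2·(4+3+2) = 18 parameters stand in for 24 cells, and the gap widens quickly at realistic cube sizes.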

    From Statistical Relational to Neurosymbolic Artificial Intelligence: a Survey

    This survey explores the integration of learning and reasoning in two different fields of artificial intelligence: neurosymbolic and statistical relational artificial intelligence. Neurosymbolic artificial intelligence (NeSy) studies the integration of symbolic reasoning and neural networks, while statistical relational artificial intelligence (StarAI) focuses on integrating logic with probabilistic graphical models. This survey identifies seven dimensions shared by these two subfields of AI, which can be used to characterize different NeSy and StarAI systems. They concern (1) the approach to logical inference, whether model-based or proof-based; (2) the syntax of the logical theories used; (3) the logical semantics of the systems and their extensions to facilitate learning; (4) the scope of learning, encompassing either parameter or structure learning; (5) the presence of symbolic and subsymbolic representations; (6) the degree to which systems capture the original logic, probabilistic, and neural paradigms; and (7) the classes of learning tasks the systems are applied to. By positioning various NeSy and StarAI systems along these dimensions and pointing out similarities and differences between them, this survey contributes fundamental concepts for understanding the integration of learning and reasoning.
    Comment: To appear in Artificial Intelligence. Shorter version in the IJCAI 2020 survey track, https://www.ijcai.org/proceedings/2020/0688.pd

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications


    Computing the decomposable entropy of belief-function graphical models

    In 2018, Jiroušek and Shenoy proposed a definition of entropy for Dempster-Shafer (D-S) belief functions called decomposable entropy (d-entropy). This paper provides an algorithm for computing the d-entropy of directed graphical D-S belief function models. We illustrate the algorithm using Almond's Captain's Problem example. For undirected graphical belief-function models, assuming that the set of belief functions in the model is non-informative, the belief functions are distinct. We illustrate this using Haenni-Lehmann's Communication Network problem. As the joint belief function for this model is quasi-consonant, it follows from a property of d-entropy that the d-entropy of this model is zero, and no algorithm is required. For a class of undirected graphical models, we provide an algorithm for computing their d-entropy. Finally, the d-entropy coincides with Shannon's entropy for the probability mass function of a single random variable, and for a large multi-dimensional probability distribution expressed as a directed acyclic graph model called a Bayesian network. We illustrate this using Lauritzen-Spiegelhalter's Chest Clinic example represented as a belief-function directed graphical model.
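    The final claim can be made concrete with a standard information-theoretic identity (textbook material, not a derivation taken from this paper): for a Bayesian network, Shannon's entropy decomposes into one conditional-entropy term per node, and the abstract states that the d-entropy of the corresponding belief-function model reproduces this value.

```latex
% Bayesian network over X_1, ..., X_n with parent sets pa(i):
%   P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_{\mathrm{pa}(i)})
% Shannon entropy then factorizes along the DAG:
H(P) = -\sum_{x} P(x)\,\log P(x) = \sum_{i=1}^{n} H\!\left(X_i \mid X_{\mathrm{pa}(i)}\right)
```

    For the Chest Clinic network, each term in the sum is computable locally from one node's conditional probability table.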

    Entropy of regular timed languages

    For timed languages, we define size measures: volume for languages with a fixed finite number of events, and entropy (growth rate) as an asymptotic measure for an unbounded number of events. These measures can be used for quantitative comparison of languages, and the entropy can be viewed as the information content of a timed language. For languages accepted by deterministic timed automata, we give exact formulas for volumes. We show that automata with non-vanishing entropy ("thick" automata) have normal (non-Zeno, discretizable, etc.) behavior for typical runs. Next, we characterize the entropy, using methods of functional analysis, as the logarithm of the leading eigenvalue (spectral radius) of a positive integral operator. We devise two methods to compute the entropy: a symbolic one for so-called "1½-clock" automata, and a numerical one (with a guarantee of convergence).
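    In the abstract's terms (the notation below is reconstructed, so the symbols are assumptions rather than the paper's own), the two size measures and the spectral characterization fit together as follows:

```latex
% V_n: volume of the set of timed words of L with exactly n events
% \Psi: the positive integral operator associated with the automaton
\mathcal{H}(L) = \limsup_{n \to \infty} \frac{\log_2 V_n}{n} = \log_2 \rho(\Psi)
```

    Here \rho(\Psi) is the leading eigenvalue (spectral radius); the symbolic and numerical procedures mentioned above are two ways of evaluating this quantity.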