    Probabilistic Programming Concepts

    A multitude of probabilistic programming languages exists today, all extending a traditional programming language with primitives that support the modeling of complex, structured probability distributions. Each of these languages employs its own probabilistic primitives and comes with a particular syntax, semantics and inference procedure. This makes it hard to understand the underlying programming concepts and to appreciate the differences between the languages. To obtain a better understanding of probabilistic programming, we identify a number of core programming concepts underlying the primitives used by various probabilistic languages, discuss the execution mechanisms that they require, and use these to position state-of-the-art probabilistic languages and their implementation. While doing so, we focus on probabilistic extensions of logic programming languages such as Prolog, which have been under development for more than 20 years.

    Shared Memory Parallel Subgraph Enumeration

    The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete, and therefore finding all such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration has applications in many fields, including biochemistry and social networks, and interestingly the fastest algorithms for solving the problem for biochemical inputs are sequential. Since they depend on depth-first tree traversal, an efficient parallelization is far from trivial. Nevertheless, since important applications produce data sets of increasing difficulty, parallelism seems beneficial. We thus present here a shared-memory parallelization of the state-of-the-art subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs) by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing, and our implementation demonstrates a significant speedup on real-world biochemical data despite a highly irregular data access pattern. We also improve RI-DS by pruning the search space better; this further improves the empirical running times compared to the already highly tuned RI-DS.
    Comment: 18 pages, 12 figures, To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017)
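
The depth-first tree traversal mentioned in the abstract above can be illustrated with a minimal sequential sketch. This is plain backtracking, not RI or RI-DS, and it includes no work stealing or pruning heuristics. The function name `enumerate_subgraphs`, the adjacency-dictionary representation, and the choice to treat graphs as undirected and matches as non-induced (every pattern edge must map to a target edge) are assumptions made for illustration.

```python
def enumerate_subgraphs(pattern, target):
    """Yield every injective mapping of pattern nodes to target nodes
    that sends each pattern edge to a target edge.

    Both graphs are dicts: node -> set of neighbouring nodes (undirected).
    Illustrative backtracking only; not the RI / RI-DS algorithm.
    """
    order = list(pattern)  # fixed matching order (hypothetical choice)

    def extend(mapping):
        if len(mapping) == len(order):
            yield dict(mapping)  # complete match: emit a copy
            return
        p = order[len(mapping)]  # next pattern node to place
        used = set(mapping.values())
        for t in target:
            if t in used:
                continue  # keep the mapping injective
            # every already-matched neighbour of p must be adjacent to t
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                yield from extend(mapping)  # descend one level
                del mapping[p]  # backtrack

    yield from extend({})
```

Each branch of this search tree is independent once its partial mapping is fixed, which is what makes the branches candidates for distribution across threads; the irregular size of those branches is why load balancing is hard.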

    Probabilistic Logic Programming with Beta-Distributed Random Variables

    We enable aProbLog, a probabilistic logic programming approach, to reason in the presence of uncertain probabilities represented as Beta-distributed random variables. We achieve the same performance as state-of-the-art algorithms for highly specified and engineered domains, while maintaining the flexibility offered by aProbLog in handling complex relational domains. Our motivation is that faithfully capturing the distribution of probabilities is necessary to compute an expected utility for effective decision making under uncertainty; unfortunately, these probability distributions can be highly uncertain due to sparse data. To understand and accurately manipulate such probability distributions we need a well-defined theoretical framework. The Beta distribution provides one: it specifies a distribution over probabilities, representing all the possible values of a probability when the exact value is unknown.
    Comment: Accepted for presentation at AAAI 201
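
To make the role of the Beta distribution concrete, the sketch below computes the mean and variance of a Beta(a, b) distribution and shows how sparse data leaves the probability highly uncertain. It is not the paper's inference procedure. The helper names `beta_from_counts` and `beta_mean_var`, and the uniform Beta(1, 1) prior, are assumptions made for illustration.

```python
def beta_from_counts(successes, failures, prior_a=1.0, prior_b=1.0):
    """Return Beta parameters after observing counts.

    Assumes a Beta(prior_a, prior_b) prior; the default is uniform.
    """
    return prior_a + successes, prior_b + failures


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


# Sparse versus plentiful evidence with the same observed ratio.
sparse = beta_mean_var(*beta_from_counts(2, 1))
dense = beta_mean_var(*beta_from_counts(200, 100))
print(sparse, dense)
```

Both cases have a mean near two thirds, but the variance in the sparse case is much larger, which is the kind of uncertainty a single point probability cannot express.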

    On the Implementation of the Probabilistic Logic Programming Language ProbLog

    The past few years have seen a surge of interest in the field of probabilistic logic learning and statistical relational learning. In this endeavor, many probabilistic logics have been developed. ProbLog is a recent probabilistic extension of Prolog motivated by the mining of large biological networks. In ProbLog, facts can be labeled with probabilities. These facts are treated as mutually independent random variables that indicate whether these facts belong to a randomly sampled program. Different kinds of queries can be posed to ProbLog programs. We introduce algorithms that allow the efficient execution of these queries, discuss their implementation on top of the YAP-Prolog system, and evaluate their performance in the context of large networks of biological entities.
    Comment: 28 pages; To appear in Theory and Practice of Logic Programming (TPLP)
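
The semantics described above, in which each probabilistic fact is independently included in a randomly sampled program, can be sketched by brute force: enumerate every subset of facts, weight it by the product of the inclusion and exclusion probabilities, and sum the weights of the subsets in which the query succeeds. This is exponential in the number of facts and is not one of the paper's algorithms. The names `success_probability` and `reachable`, and the restriction to a directed path query over probabilistic edges, are assumptions made for illustration.

```python
from itertools import product


def reachable(edges, source, goal):
    """Return True if goal can be reached from source along directed edges."""
    frontier, seen = [source], {source}
    while frontier:
        node = frontier.pop()
        if node == goal:
            return True
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                frontier.append(v)
    return False


def success_probability(prob_edges, source, goal):
    """Sum the probabilities of all sampled programs in which a path exists.

    prob_edges maps a directed edge (u, v) to its probability; edges are
    treated as mutually independent probabilistic facts.
    """
    facts = list(prob_edges.items())
    total = 0.0
    for choices in product([True, False], repeat=len(facts)):
        weight = 1.0
        world = []
        for (edge, p), included in zip(facts, choices):
            weight *= p if included else 1.0 - p
            if included:
                world.append(edge)
        if reachable(world, source, goal):
            total += weight
    return total


edges = {("a", "b"): 0.6, ("b", "c"): 0.5, ("a", "c"): 0.2}
print(success_probability(edges, "a", "c"))
```

For this small graph the query succeeds through the direct edge or through the two-step path, so the result can be checked by hand as 1 - (1 - 0.2)(1 - 0.6 × 0.5) = 0.44, up to floating-point rounding.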

    Subgraph Pattern Matching over Uncertain Graphs with Identity Linkage Uncertainty

    There is a growing need for methods that can capture uncertainties and answer queries over graph-structured data. Two common types of uncertainty are uncertainty over the attribute values of nodes and uncertainty over the existence of edges. In this paper, we combine those with identity uncertainty. Identity uncertainty represents uncertainty over the mapping from objects mentioned in the data, or references, to the underlying real-world entities. We propose the notion of a probabilistic entity graph (PEG), a probabilistic graph model that defines a distribution over possible graphs at the entity level. The model takes into account node attribute uncertainty, edge existence uncertainty, and identity uncertainty, and thus enables us to systematically reason about all three types of uncertainty in a uniform manner. We introduce a general framework for constructing a PEG given uncertain data at the reference level and develop highly efficient algorithms to answer subgraph pattern matching queries in this setting. Our algorithms are based on two novel ideas: context-aware path indexing and reduction by join-candidates, which drastically reduce the query search space. A comprehensive experimental evaluation shows that our approach outperforms baseline implementations by orders of magnitude.

    DNF Sampling for ProbLog Inference

    Inference in probabilistic logic languages such as ProbLog, an extension of Prolog with probabilistic facts, is often based on a reduction to a propositional formula in DNF. Calculating the probability of such a formula involves the disjoint-sum problem, which is computationally hard. In this work we introduce a new approximation method for ProbLog inference which exploits the DNF to focus sampling. While this DNF sampling technique has been applied to a variety of tasks before, to the best of our knowledge it has not been used for inference in probabilistic logic systems. The paper also presents an experimental comparison with another sampling-based inference method previously introduced for ProbLog.
    Comment: Online proceedings of the Joint Workshop on Implementation of Constraint Logic Programming Systems and Logic-based Methods in Programming Environments (CICLOPS-WLPE 2010), Edinburgh, Scotland, U.K., July 15, 201
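
One well-known way to use a DNF to focus sampling is the Karp-Luby estimator, sketched below. Whether the paper uses exactly this variant is an assumption, as are the function names `clause_prob` and `karp_luby` and the representation of a clause as a dict from variable name to required truth value. The idea is to sample a clause in proportion to its probability, sample a world in which that clause holds, and count the sample only if the chosen clause is the first one the world satisfies, which avoids double counting overlapping clauses.

```python
import random


def clause_prob(clause, probs):
    """Probability that every literal of a conjunction holds (independent vars)."""
    result = 1.0
    for var, value in clause.items():
        result *= probs[var] if value else 1.0 - probs[var]
    return result


def karp_luby(dnf, probs, samples=20000, rng=None):
    """Estimate the probability of a DNF over independent Boolean variables.

    dnf is a list of clauses; each clause maps a variable to the truth value
    it requires. probs maps each variable to its probability of being True.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    weights = [clause_prob(c, probs) for c in dnf]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    hits = 0
    for _ in range(samples):
        # pick a clause in proportion to its probability
        i = rng.choices(range(len(dnf)), weights=weights)[0]
        # sample a world, then force the chosen clause to hold
        world = {v: rng.random() < p for v, p in probs.items()}
        world.update(dnf[i])
        # index of the first clause this world satisfies
        first = next(
            j for j, c in enumerate(dnf)
            if all(world[v] == val for v, val in c.items())
        )
        if first == i:
            hits += 1
    return total * hits / samples


probs = {"ab": 0.6, "bc": 0.5, "ac": 0.2}
dnf = [{"ac": True}, {"ab": True, "bc": True}]
print(karp_luby(dnf, probs))
```

The two clauses here overlap, so simply adding their probabilities (0.2 + 0.3) would overcount; the estimate should instead land close to the exact value 1 - (1 - 0.2)(1 - 0.3) = 0.44.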