Search CORE

9,005 research outputs found

On the Implementation of the Probabilistic Logic Programming Language ProbLog

Author: ANGELIKA KIMMIG
Bachmair
BART DEMOEN
Cussens
Dalvi
Dantsin
De Raedt
De Raedt
Getoor
Graf
Gutmann
Ishihata
Kimmig
Kimmig
LUC DE RAEDT
Muggleton
RICARDO ROCHA
Riguzzi
Santos
Santos Costa
Santos Costa
Sato
Sato
VÍTOR SANTOS COSTA
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2009
Field of study

The past few years have seen a surge of interest in the field of probabilistic logic learning and statistical relational learning. In this endeavor, many probabilistic logics have been developed. ProbLog is a recent probabilistic extension of Prolog motivated by the mining of large biological networks. In ProbLog, facts can be labeled with probabilities. These facts are treated as mutually independent random variables that indicate whether these facts belong to a randomly sampled program. Different kinds of queries can be posed to ProbLog programs. We introduce algorithms that allow the efficient execution of these queries, discuss their implementation on top of the YAP-Prolog system, and evaluate their performance in the context of large networks of biological entities.Comment: 28 pages; To appear in Theory and Practice of Logic Programming (TPLP

arXiv.org e-Print Archive

Lirias

CiteSeerX

Crossref

Repositório Aberto da Universidade do Porto

Time-Aware Probabilistic Knowledge Graphs

Author: Chekol Melisachew Wudage
Stuckenschmidt Heiner
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Symposium on Temporal Representation and Reasoning (TIME 2019)
Publication date: 01/01/2019
Field of study

The emergence of open information extraction as a tool for constructing and expanding knowledge graphs has aided the growth of temporal data, for instance, YAGO, NELL and Wikidata. While YAGO and Wikidata maintain the valid time of facts, NELL records the time point at which a fact is retrieved from some Web corpora. Collectively, these knowledge graphs (KG) store facts extracted from Wikipedia and other sources. Due to the imprecise nature of the extraction tools that are used to build and expand KG, such as NELL, the facts in the KG are weighted (a confidence value representing the correctness of a fact). Additionally, NELL can be considered as a transaction time KG because every fact is associated with extraction date. On the other hand, YAGO and Wikidata use the valid time model because they maintain facts together with their validity time (temporal scope). In this paper, we propose a bitemporal model (that combines transaction and valid time models) for maintaining and querying bitemporal probabilistic knowledge graphs. We study coalescing and scalability of marginal and MAP inference. Moreover, we show that complexity of reasoning tasks in atemporal probabilistic KG carry over to the bitemporal setting. Finally, we report our evaluation results of the proposed model

MAnnheim DOCument Server

Dagstuhl Research Online Publication Server

Structurally Tractable Uncertain Data

Author: Abiteboul S.
Abiteboul S.
Agrawal R.
Amarilli A.
Carlson A.
Courcelle B.
Deutch D.
Dong X.
Galárraga L.
Gottlob G.
Lauritzen S. L.
Maniu S.
Raedt L. D.
Robertson N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/07/2015
Field of study

Many data management applications must deal with data which is uncertain, incomplete, or noisy. However, on existing uncertain data representations, we cannot tractably perform the important query evaluation tasks of determining query possibility, certainty, or probability: these problems are hard on arbitrary uncertain input instances. We thus ask whether we could restrict the structure of uncertain data so as to guarantee the tractability of exact query evaluation. We present our tractability results for tree and tree-like uncertain data, and a vision for probabilistic rule reasoning. We also study uncertainty about order, proposing a suitable representation, and study uncertain data conditioned by additional observations.Comment: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium 201

arXiv.org e-Print Archive

Crossref

Integrating and Ranking Uncertain Scientific Data

Author: Detwiler Landon T
Gatterbauer Wolfgang
Louie Brenton
Suciu Dan
Tarczy-Hornoch Peter
Publication venue
Publication date: 01/01/2008
Field of study

Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve predicting the well-known). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates

CiteSeerX

Crossref

University of Washington Structural Informatics Group Publications

Quasi-SLCA based Keyword Query Processing over Probabilistic XML Data

Author: Li Jianxin
Liu Chengfei
Yu Jeffrey Xu
Zhou Rui
Publication venue
Publication date: 10/01/2013
Field of study

The probabilistic threshold query is one of the most common queries in uncertain databases, where a result satisfying the query must be also with probability meeting the threshold requirement. In this paper, we investigate probabilistic threshold keyword queries (PrTKQ) over XML data, which is not studied before. We first introduce the notion of quasi-SLCA and use it to represent results for a PrTKQ with the consideration of possible world semantics. Then we design a probabilistic inverted (PI) index that can be used to quickly return the qualified answers and filter out the unqualified ones based on our proposed lower/upper bounds. After that, we propose two efficient and comparable algorithms: Baseline Algorithm and PI index-based Algorithm. To accelerate the performance of algorithms, we also utilize probability density function. An empirical study using real and synthetic data sets has verified the effectiveness and the efficiency of our approaches

arXiv.org e-Print Archive

Deakin Research Online

Adelaide Research & Scholarship

Swinburne Research Bank

Infinite Probabilistic Databases

Author: Grohe Martin
Lindner Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020
Field of study

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server