Causality and the semantics of provenance

Provenance, or information about the sources, derivation, custody or history of data, has been studied recently in a number of contexts, including databases, scientific workflows and the Semantic Web. Many provenance mechanisms have been developed, motivated by informal notions such as influence, dependence, explanation and causality. However, there has been little study of whether these mechanisms formally satisfy appropriate policies or even how to formalize relevant motivating concepts such as causality. We contend that mathematical models of these concepts are needed to justify and compare provenance techniques. In this paper we review a theory of causality based on structural models that has been developed in artificial intelligence, and describe work in progress on a causal semantics for provenance graphs. Comment: Workshop submission

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

The Open Provenance Model

Author: Freire Juliana
Futrelle Joe
McGrath Robert
Moreau Luc
Myers Jim
Paulson Patrick
Publication venue: s.n.
Publication date: 18/12/2007

Southampton (e-Prints Soton)

From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

Author: Bertossi Leopoldo
Salimi Babak
Publication date: 13/12/2014

In this work we establish and investigate connections between causes for query answers in databases, database repairs with respect to denial constraints, and consistency-based diagnosis. The first two are relatively new research areas in databases, and the third one is an established subject in knowledge representation. We show how to obtain database repairs from causes, and the other way around. Causality problems are formulated as diagnosis problems, and the diagnoses provide causes and their responsibilities. The vast body of research on database repairs can be applied to the newer problems of computing actual causes for query answers and their responsibilities. These connections, which are interesting per se, allow us, after a transition (inspired by consistency-based diagnosis) to computational problems on hitting sets and vertex covers in hypergraphs, to obtain several new algorithmic and complexity results for database causality. Comment: To appear in Theory of Computing Systems. By invitation to special issue with extended papers from ICDT 2015 (paper arXiv:1412.4311
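The abstract above mentions a transition to hitting-set problems in hypergraphs. As a rough illustration only, the sketch below enumerates minimum hitting sets by brute force. It is a generic textbook search, not the reduction or the algorithms of the paper; the function name `minimum_hitting_sets` and the tuple labels `t1`..`t4` are hypothetical.

```python
from itertools import combinations


def minimum_hitting_sets(sets):
    """Return all smallest sets that intersect every set in `sets`.

    Brute force over candidate subsets in order of increasing size,
    so this is only suitable for tiny inputs.
    """
    universe = sorted(set().union(*sets))
    for size in range(len(universe) + 1):
        hits = [
            set(candidate)
            for candidate in combinations(universe, size)
            if all(set(candidate) & s for s in sets)
        ]
        if hits:
            return hits
    # Reached only when some input set is empty and cannot be hit.
    return []


# Hypothetical hyperedges: each one is a set of tuples, at least one of
# which must be chosen.
hyperedges = [{"t1", "t2"}, {"t2", "t3"}, {"t3", "t4"}]
result = minimum_hitting_sets(hyperedges)
print(result)
```

No single element hits all three hyperedges here, so the search returns the hitting sets of size two.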

arXiv.org e-Print Archive

Carleton University's Institutional Repository

Dagstuhl Research Online Publication Server

A Formal Account of the Open Provenance Model

Author: Activity C Provenance Incubator
C
Dey Saumen
Hull Richard
Jacobs Ian
Jan Van Den Bussche
Luc Moreau
Mattern Friedemann
Moreau Luc
Myers James
Natalia Kwasnikowska
Robert
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/05/2015

On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that would allow users to place their trust in the resources they access. The Open Provenance Model (OPM) is a community data model for provenance that is designed to facilitate the meaningful interchange of provenance information between systems. Underpinning OPM is a notion of directed graph, where nodes represent data products and processes involved in past computations, and edges represent dependencies between them; it is complemented by graphical inference rules allowing new dependencies to be derived. Until now, however, OPM was a purely syntactical endeavor. The present paper extends OPM graphs with an explicit distinction between precise and imprecise edges. Then a formal semantics for the thus enriched OPM graphs is proposed, by viewing OPM graphs as temporal theories on the temporal events represented in the graph. The original OPM inference rules are scrutinized in view of the semantics and found to be sound but incomplete. An extended set of graphical rules is provided and proved to be complete for inference. The paper concludes with applications of the formal semantics to inferencing in OPM graphs, operators on OPM graphs, and a formal notion of refinement among OPM graphs.
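To make the idea of a dependency graph with derivable edges concrete, here is a deliberately simplified sketch. It assumes a single toy rule, that dependencies compose transitively. This is not the OPM rule set, which distinguishes edge types and, in the paper above, precise from imprecise edges. The node names and the function `derive_dependencies` are hypothetical.

```python
def derive_dependencies(edges):
    """Return the transitive closure of a set of (effect, cause) edges.

    Toy inference rule (an assumption, not OPM's actual rules):
    if a depends on b and b depends on c, then a depends on c.
    """
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical graph: a report depends on an analysis step,
# which in turn depends on a dataset.
asserted = {("report", "analysis"), ("analysis", "dataset")}
derived = derive_dependencies(asserted) - asserted
print(derived)
```

Keeping asserted and derived edges apart, as in the last two lines, is one simple way to see what an inference rule adds to a graph.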

Southampton (e-Prints Soton)

Crossref

King's Research Portal

On the Limitations of Provenance for Queries With Difference

Author: Amsterdamer Yael
Deutch Daniel
Tannen Val
Publication date: 01/01/2011

The annotation of the results of database transformations was shown to be very effective for various applications. Until recently, most works in this context focused on positive query languages. Provenance semirings are a particular approach that was proven effective for these languages, and it was shown that when propagating provenance with semirings, the expected equivalence axioms of the corresponding query languages are satisfied. There have been several attempts to extend the framework to account for relational algebra queries with difference. We show here that these suggestions fail to satisfy some expected equivalence axioms (that in particular hold for queries on "standard" set and bag databases). Interestingly, we show that this is not a pitfall of these particular attempts, but rather every such attempt is bound to fail in satisfying these axioms, for some semirings. Finally, we show particular semirings for which an extension for supporting difference is (im)possible. Comment: TAPP 201
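As a minimal sketch of how annotations can be propagated through positive queries with a semiring, the code below assumes the usual convention that union and projection combine annotations with the semiring's addition, while join combines them with its multiplication. The `Semiring` class, the helper names, and the sample relations are hypothetical and are not taken from the paper. Difference is deliberately omitted, since that is where the abstract reports the difficulties.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Two standard instances: natural numbers (bag semantics) and Booleans (set semantics).
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
SET = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def union(k, r, s):
    """Union of annotated relations: annotations of equal tuples are added."""
    out = dict(r)
    for t, a in s.items():
        out[t] = k.plus(out.get(t, k.zero), a)
    return out


def join_on_first(k, r, s):
    """Join r(x, y) with s(x, z) on x: annotations are multiplied."""
    out = {}
    for t1, a1 in r.items():
        for t2, a2 in s.items():
            if t1[0] == t2[0]:
                t = t1 + t2[1:]
                out[t] = k.plus(out.get(t, k.zero), k.times(a1, a2))
    return out


def project_first(k, r):
    """Project onto the first column: annotations of merged tuples are added."""
    out = {}
    for t, a in r.items():
        key = (t[0],)
        out[key] = k.plus(out.get(key, k.zero), a)
    return out


# Hypothetical bag relations: each tuple maps to its multiplicity.
r = {("a", "b"): 2, ("a", "c"): 3}
s = {("a", "d"): 5}
bag_result = project_first(BAG, join_on_first(BAG, r, s))

# The same query under set semantics: every present tuple is annotated True.
r_set = {t: True for t in r}
s_set = {t: True for t in s}
set_result = project_first(SET, join_on_first(SET, r_set, s_set))
print(bag_result, set_result)
```

Passing the semiring as a parameter lets the same query code run under bag or set semantics, which is the kind of uniformity the semiring approach aims for on positive queries.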

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Enhancing Workflow with a Semantic Description of Scientific Intent

Author: Edwards Peter
Gotts Nick
Pignotti Edoardo
Polhill Gary
Publication venue: 'Elsevier BV'
Publication date: 10/05/2011

Peer reviewed. Preprint

Aberdeen University Research

Crossref

Dynamic Provenance for SPARQL Update

Author: C. Gutierrez
G. Flouris
H. Halpin
J. Pérez
J.J. Carroll
L. Moreau
N. Lopes
O. Udrea
P. Buneman
R. Horne
R.T. Snodgrass
T.J. Green
V. Papavassiliou
Y. Theoharis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

While the Semantic Web currently can exhibit provenance information by using the W3C PROV standards, there is a "missing link" in connecting PROV to storing and querying for dynamic changes to RDF graphs using SPARQL. Solving this problem would be required for such clear use-cases as the creation of version control systems for RDF. While some provenance models and annotation techniques for storing and querying provenance data originally developed with databases or workflows in mind transfer readily to RDF and SPARQL, these techniques do not readily adapt to describing changes in dynamic RDF datasets over time. In this paper we explore how to adapt the dynamic copy-paste provenance model of Buneman et al. [2] to RDF datasets that change over time in response to SPARQL updates, how to represent the resulting provenance records themselves as RDF in a manner compatible with W3C PROV, and how the provenance information can be defined by reinterpreting SPARQL updates. The primary contribution of this paper is a semantic framework that enables the semantics of SPARQL Update to be used as the basis for a 'cut-and-paste' provenance model in a principled manner. Comment: Pre-publication version of ISWC 2014 paper

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

A Provenance Tracking Model for Data Updates

Author: Ciobanu Gabriel
Horne Ross
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2012

For data-centric systems, provenance tracking is particularly important when the system is open and decentralised, such as the Web of Linked Data. In this paper, a concise but expressive calculus which models data updates is presented. The calculus is used to provide an operational semantics for a system where data and updates interact concurrently. The operational semantics of the calculus also tracks the provenance of data with respect to updates. This provides a new formal semantics extending provenance diagrams which takes into account the execution of processes in a concurrent setting. Moreover, a sound and complete model for the calculus based on ideals of series-parallel DAGs is provided. The notion of provenance introduced can be used as a subjective indicator of the quality of data in concurrent interacting systems. Comment: In Proceedings FOCLASA 2012, arXiv:1208.432

arXiv.org e-Print Archive

University of Strathclyde Institutional Repository

Directory of Open Access Journals

DR-NTU (Digital Repository of NTU)