A Formal Account of the Open Provenance Model
On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that would allow users to place their trust in the resources they access. The Open Provenance Model (OPM) is a community data model for provenance that is designed to facilitate the meaningful interchange of provenance information between systems. Underpinning OPM is a notion of directed graph, where nodes represent data products and processes involved in past computations, and edges represent dependencies between them; it is complemented by graphical inference rules allowing new dependencies to be derived. Until now, however, OPM was a purely syntactical endeavor. The present paper extends OPM graphs with an explicit distinction between precise and imprecise edges. Then a formal semantics for the thus enriched OPM graphs is proposed, by viewing OPM graphs as temporal theories on the temporal events represented in the graph. The original OPM inference rules are scrutinized in view of the semantics and found to be sound but incomplete. An extended set of graphical rules is provided and proved to be complete for inference. The paper concludes with applications of the formal semantics to inferencing in OPM graphs, operators on OPM graphs, and a formal notion of refinement among OPM graphs.
A Provenance Tracking Model for Data Updates
For data-centric systems, provenance tracking is particularly important when
the system is open and decentralised, such as the Web of Linked Data. In this
paper, a concise but expressive calculus which models data updates is
presented. The calculus is used to provide an operational semantics for a
system where data and updates interact concurrently. The operational semantics
of the calculus also tracks the provenance of data with respect to updates.
This provides a new formal semantics extending provenance diagrams which takes
into account the execution of processes in a concurrent setting. Moreover, a
sound and complete model for the calculus based on ideals of series-parallel
DAGs is provided. The notion of provenance introduced can be used as a
subjective indicator of the quality of data in concurrent interacting systems.
Comment: In Proceedings FOCLASA 2012, arXiv:1208.432
Enhancing Workflow with a Semantic Description of Scientific Intent
Peer reviewed, Preprint
Causality and the semantics of provenance
Provenance, or information about the sources, derivation, custody or history
of data, has been studied recently in a number of contexts, including
databases, scientific workflows and the Semantic Web. Many provenance
mechanisms have been developed, motivated by informal notions such as
influence, dependence, explanation and causality. However, there has been
little study of whether these mechanisms formally satisfy appropriate policies
or even how to formalize relevant motivating concepts such as causality. We
contend that mathematical models of these concepts are needed to justify and
compare provenance techniques. In this paper we review a theory of causality
based on structural models that has been developed in artificial intelligence,
and describe work in progress on a causal semantics for provenance graphs.
Comment: Workshop submission
Dynamic Provenance for SPARQL Update
While the Semantic Web currently can exhibit provenance information by using
the W3C PROV standards, there is a "missing link" in connecting PROV to storing
and querying for dynamic changes to RDF graphs using SPARQL. Solving this
problem would be required for such clear use-cases as the creation of version
control systems for RDF. While some provenance models and annotation techniques
for storing and querying provenance data originally developed with databases or
workflows in mind transfer readily to RDF and SPARQL, these techniques do not
readily adapt to describing changes in dynamic RDF datasets over time. In this
paper we explore how to adapt the dynamic copy-paste provenance model of
Buneman et al. [2] to RDF datasets that change over time in response to SPARQL
updates, how to represent the resulting provenance records themselves as RDF in
a manner compatible with W3C PROV, and how the provenance information can be
defined by reinterpreting SPARQL updates. The primary contribution of this
paper is a semantic framework that enables the semantics of SPARQL Update to be
used as the basis for a 'cut-and-paste' provenance model in a principled
manner.
Comment: Pre-publication version of ISWC 2014 paper
e-Social Science and Evidence-Based Policy Assessment: Challenges and Solutions
Peer reviewed, Preprint
The lifecycle of provenance metadata and its associated challenges and opportunities
This chapter outlines some of the challenges and opportunities associated
with adopting provenance principles and standards in a variety of disciplines,
including data publication and reuse, and information sciences.